Hey,
I have been writing python in my everyday work for the last 5 years or so. However, I never took a course on python. I want to prepare for some upcoming job interviews where I will be programming in python. How can I improve my ability to write data structure/algo code in python? I want to gain a similar understanding of what is "fast" in C++ (std::move, copy elision, etc.), but for python. Any suggested resources?
Some general tips for algorithmic complexity in Python:
- Python's list is equivalent to C++ std::vector. If you need to push/pop at the head of the list, use Python's "collections.deque" to avoid the O(N) cost. Python's standard library doesn't have a linked list implementation.
- Python's dict is a fast unordered hashmap. However, if you need order-aware operations like C++'s std::map::lower_bound(), you're out of luck; Python's standard library doesn't have a tree implementation.
- Python has a "bisect" module for binary searching in a Python list, and a "heapq" module for using a Python list as a heap. However, neither one is nicely encapsulated in a data type.
If your Python program is seriously CPU-bound, the normal solution is to use a C/C++/Rust extension. For example, if you're doing large numeric calculations, use NumPy; it can store a vector of numbers as a single contiguous array, which is much faster than a list-of-Python-floats.
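For instance, a rough micro-benchmark sketch (exact numbers vary by machine, but the gap is typically an order of magnitude or more):

```python
import timeit
import numpy as np

n = 1_000_000
py_list = [float(i) for i in range(n)]   # a million boxed Python floats
arr = np.arange(n, dtype=np.float64)     # one contiguous C array of doubles

print(timeit.timeit(lambda: sum(py_list), number=10))   # interpreter loop over boxed objects
print(timeit.timeit(lambda: arr.sum(), number=10))      # single vectorized C loop
```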
If you want to parallelize across CPU cores, it's important to understand the Python GIL (global interpreter lock). Often you need to use multiple Python processes. See e.g. https://superfastpython.com/multiprocessing-pool-gil/
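A minimal multiprocessing sketch, assuming a CPU-bound function you want to fan out across cores (threads would not help here, because the pure-Python loop holds the GIL):

```python
from multiprocessing import Pool

def cpu_bound(n):
    # stand-in for real work; a pure-Python loop like this holds the GIL
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool() as pool:                       # one worker process per CPU core by default
        results = pool.map(cpu_bound, [10**6] * 8)
    print(results[:2])
```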
Maybe also worth reading about __slots__ (https://wiki.python.org/moin/UsingSlots); it's a micro-optimization that helps when allocating a lot of tiny Python objects.
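The __slots__ idea in miniature: declaring the attribute layout up front removes the per-instance __dict__, which adds up when you allocate millions of small objects.

```python
import sys

class Point:                  # normal class: every instance carries its own __dict__
    def __init__(self, x, y):
        self.x, self.y = x, y

class SlotPoint:              # __slots__ class: fixed attribute layout, no per-instance __dict__
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x, self.y = x, y

print(sys.getsizeof(Point(1, 2).__dict__))   # the extra dict every normal instance pays for
# SlotPoint(1, 2).__dict__ raises AttributeError: there is no per-instance dict at all
```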
Hope some of that is helpful! Good luck with your job interviews.
I'm fairly certain that in Python 3.7 and later, standard library dictionaries preserve insertion order by default.
This might or might not be what you want/expect...
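To be precise, dicts preserve insertion order, which is not the same as the sorted-key order a C++ std::map iterates in:

```python
d = {}
d["b"] = 2
d["a"] = 1
d["c"] = 3
print(list(d))     # ['b', 'a', 'c'] -- insertion order, guaranteed since 3.7
print(sorted(d))   # ['a', 'b', 'c'] -- sorted order still needs an explicit sort
```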
* Flattening data from hierarchical, custom types into flatter, builtin collection types can make a speed difference. The builtin types' methods spend more of their time in native code and are well optimized.
* Lots of things that I thought would make a difference barely move the needle. There is so much pointer indirection internal to CPython.
* The set and frozenset types can offer easy and immediate algorithmic improvements. This is obvious from a formal/academic perspective but continues to be my favorite thing about writing quick and dirty python. I have encountered cases where an algorithm called for a fancy data structure but simple set intersection and difference was good enough (small example after this list).
* Over time, Python taught me to worry less about optimizing things that didn't matter. Everyone pays lip service to the perils of premature optimization, but I think there are layers to the problem. Fast languages offer many affordances for optimization, and to some extent compel you to make detailed choices that have performance implications (e.g. choice of integer width, choice of collection type). There is something very liberating about reminding myself "it's going to be slow either way, and it doesn't matter." When I work in Swift, for example, I find myself getting distracted by finer details that relate to efficiency.
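For the set point above, a small illustration (the data is made up, but the pattern is the usual one: replace nested membership loops with set algebra):

```python
# Roughly O(len(a) * len(b)) as nested loops, roughly O(len(a) + len(b)) as sets.
active_users = {"ana", "bo", "chen", "dee"}
paying_users = {"bo", "dee", "eli"}

churn_risks = active_users - paying_users    # difference: active but not paying
overlap     = active_users & paying_users    # intersection: active and paying
everyone    = active_users | paying_users    # union

print(churn_risks, overlap, everyone)
```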
This cannot be overstated. Really, truly, forget everything you know about cache locality when doing even basic "for-each: add numbers together"-type things in Python.
That's not an indictment; such indirection is par for the course for scripting languages, and enables a lot of features and dynamism. Additionally, the cost of that indirection may go down over time thanks to interpreter optimizations like the JIT: https://tonybaloney.github.io/posts/python-gets-a-jit.html
You may want to take a look at Mojo: https://www.modular.com/max/mojo
It's not really there yet, but the promise is to "combine the usability of Python with the performance of C"
In many cases, it is possible to get Python code to within 2-10x of C using numpy/numba/cython... but with 10% of the lines of code. Often, that is an acceptable trade.
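As a sketch of that trade with numba (one of the libraries mentioned; assumes `pip install numba`, and the 2-10x figure above is the commenter's estimate, not a guarantee):

```python
from numba import njit
import numpy as np

@njit
def pairwise_min_dist(points):
    # naive O(N^2) nested loops: painful in pure Python, fine once JIT-compiled
    best = np.inf
    n = points.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.0
            for k in range(points.shape[1]):
                diff = points[i, k] - points[j, k]
                d += diff * diff
            if d < best:
                best = d
    return best ** 0.5

pts = np.random.rand(500, 2)
print(pairwise_min_dist(pts))   # first call pays the compile cost, later calls run at native speed
```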
Speed isn't the purpose of the language, but (business) needs generally don't care about such things. Sometimes you can't rewrite in Rust for a variety of reasons, and then it can pay off to know what's faster rather than slower in CPython.
Book: High Performance Python
Most of that content has a scientific focus, but the obvious thing that carries over to any kind of Python work is _profiling_ to figure out what's slow. Top tools I'd recommend are:
* https://pypi.org/project/scalene/ combined cpu+memory+gpu profiling
* https://github.com/gaogaotiantian/viztracer get a timeline of execution vs call-stack (great to discover what's happening deep inside pandas)
* my https://pypi.org/project/ipython-memory-usage/ if you're in Jupyter Notebooks (built on https://github.com/pythonprofilers/memory_profiler which sadly is unmaintained)
* https://github.com/pyutils/line_profiler/ (usage sketch below)
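As a taste of the last one, line_profiler's basic workflow is to decorate the hot function and run the script under kernprof; recent versions also ship an importable decorator (a rough sketch, assuming line_profiler is installed):

```python
# Run with: kernprof -l -v this_script.py
# (kernprof injects `profile`; newer releases also allow the explicit import below,
#  enabled via the LINE_PROFILE=1 environment variable.)
from line_profiler import profile

@profile
def slow_path(rows):
    out = []
    for row in rows:                          # per-line hit counts and timings land here
        cleaned = [c.strip() for c in row]
        out.append(",".join(cleaned))
    return out

if __name__ == "__main__":
    slow_path([["  a", "b "], ["c", " d"]] * 10_000)
```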
Every PyCon has at least one talk on Python performance, and they are usually filled with a variety of tips and insights.
Prioritise keynote talks, then recent ones, then country-level PyCons, and then city-level ones.
https://pycon.org/
Many things that one might take as given in other languages are optional in Python: static type analysis, multithreading, immutability and the like. When it comes to writing algorithms in Python, it's best to treat it as executable pseudocode: get to a correct solution first and optimise for performance later.
In Python, development happens in distinct steps. The first step is to get your code to do what you need, without worrying about readability or efficiency. In many cases, this will result in code that is ready to be used, even in production (especially if you include unit tests and type checking, for good measure); if not, the next step is to comb through the code and optimise it to be faster. What that means exactly will depend on your needs, but with a combination of external libraries like NumPy, multiprocessing and async it is quite possible to reach sufficient performance.
Finally, if none of that results in the desired speed, the performance-critical components can be extracted and rewritten in a low-level language (where you have a huge advantage as a C++ dev) that Python will happily wrap. This is the exact strategy used by most performant libraries out there, like NumPy itself.
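A bare-bones sketch of that wrapping step with the stdlib ctypes module, calling into libm as a stand-in for your own compiled C/C++/Rust library (works on Linux/macOS; on Windows you'd point CDLL at your own DLL):

```python
import ctypes
import ctypes.util

# In practice this would be your own shared library built from C/C++/Rust;
# libm is used here only because it is already on the system.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature so ctypes converts arguments and results correctly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))   # 1.0, computed by the C library rather than the interpreter
```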
I have no doubt that Golang is faster and more efficient for HTTP serving, but if you don't have a specification for your app that you're happy with for the foreseeable future and need to keep moving crap around, I'd so much prefer Python.
And if I hear some flavor of "compile-time checks are tests/but Python doesn't check types" argument, that person shouldn't be involved in software development.
Have you ever even used a powerful type system? I don't think so.