The Mathematical Theory of Communication by Shannon and Weaver. It’s an extended version of Shannon’s original paper that established Information Theory, with some extra explanations and background. 144 pages.
Atiyah & Macdonald’s Introduction to Commutative Algebra fits. It’s 125 pages long, and it’s possible to do all the exercises in 2–3 weeks – I did them over winter break in preparation for a course.
Lang’s Algebra and Eisenbud’s Commutative Algebra are both supersets of Atiyah & Macdonald; I’ve studied each of those as well and thought A&M was significantly better.
Unfortunately, I think it isn’t very compatible with the way management works at most companies. Normally there’s pressure to get your tickets done quickly, which leaves less time for “refactor as you go”.
I’ve heard this a lot, but I’ve worked at 8 companies so far, and none of them have had this kind of time pressure. Is there a specific industry or location where this is more common?
A big piece is that companies are extremely siloed by default. It’s pretty easy for a team to improve things in their silo, it’s significantly harder to improve something that requires two teams, it’s nearly impossible to reach beyond that.
Uber is particularly siloed: they have a huge number of microservices with small teams, at least according to their engineering talks on YouTube. Address validation is probably a separate service from anything related to maps, which in turn is separate from contacts.
Because of silos, companies have to make an extraordinary effort to actually end up with good UX. Apple was an example of this, where it was literally driven by the founder & CEO of the company. Tumblr was known for this as well. But from what I’ve heard, Travis was more of a logistics person than a UX person, etc.
(I don’t think silos explain the bank validation issue)
Cooking:
Smelling ingredients & food is a good way to develop intuition about how things will taste when combined
Salting early is generally much better than salting late
Data Science:
Interactive environments like Jupyter notebooks are a huge productivity win, even with their disadvantages
Automatic code reloading makes Jupyter much more productive (e.g. autoreload for Python, or Revise for Julia)
Bootstrapping gives you fast, accurate statistics in a lot of areas without needing to be too precise about theory
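As a sketch of the bootstrapping point, assuming NumPy and an arbitrary made-up skewed sample: resample with replacement, recompute the statistic each time, and read the interval straight off the percentiles – no distributional theory required.

```python
# Minimal bootstrap sketch; the data and the 95% level are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=500)  # skewed sample, no nice closed form

# Resample with replacement many times, recording the statistic of interest
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(10_000)
])

# Percentile interval: no distributional assumptions needed
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean: {data.mean():.3f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

The same recipe works for medians, ratios, correlations, etc. – only the statistic inside the loop changes.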
Programming:
Do everything in a virtual environment or the equivalent for your language. Even if you use literally one environment on your machine, the tooling around these tends to be much better
Have some form of reasonably accurate, reasonably fast feedback loop(s). Types, tests, whatever – the best choice depends a lot on the problem domain. But the worst default is no feedback loop
Ping-pong:
People adapt to your style very rapidly, even within a single game. Learn 2-3 complementary styles and switch them up when somebody gets used to one
Friendship:
Set up easy, default ways to interact with your friends, such as getting weekly coffees, making it easy for them to visit, hosting board game nights etc.
Take notes on what your friends like
When your friends have persistent problems, take notes on what they’ve tried. When you hear of something they haven’t tried, recommend it. This is both practical, and the fact that you’ve customized the advice is generally appreciated
Conversations:
Realize that small amounts of awkwardness, silence etc. are generally not a problem. I was implicitly following a strategy that tried to absolutely minimize awkwardness for a long time, which was a bad idea
Using vector syntax is much faster than loops in Python

To generalize this slightly, using Python to call C/C++ is generally much faster than pure Python. For example, built-in operations in Pandas tend to be pretty fast, while using `.apply()` is usually pretty slow.
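A concrete (made-up) illustration of the gap: both lines below compute the same column, but the first dispatches to compiled code once, while the second runs a Python function per row – typically one to two orders of magnitude slower at this size.

```python
# Vectorized Pandas operation vs. row-wise .apply(); the DataFrame is illustrative.
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(100_000), "b": np.arange(100_000)})

vectorized = df["a"] + df["b"]  # stays in compiled code
applied = df.apply(lambda row: row["a"] + row["b"], axis=1)  # Python-level loop per row

assert (vectorized == applied).all()  # same result, very different speed
```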
I didn’t know about that, thanks!
I found Loop Hero much better with higher speed, which you can fix by modifying a `variables.ini` file: https://www.pcinvasion.com/loop-hero-speed-mod/
I’ve used `Optim.jl` for similar problems with good results, here’s an example: https://julianlsolvers.github.io/Optim.jl/stable/#user/minimization/
The general lesson is that “magic” interfaces which try to ‘do what I mean’ are nice to work with at the top-level, but it’s a lot easier to reason about composing primitives if they’re all super-strict.
100% agree. In general I usually aim to have a thin boundary layer that does validation and converts everything to nice types/data structures, and then a much stricter core of inner functionality. Part of the reason I chose to write about this example is because it’s very different from what I normally do.
Important caveat for the pass-through approach: if any of your `build_dataset()` functions accept `**kwargs`, you have to be very careful about how they’re handled to preserve the property that “calling a function with unused arguments is an error”. It was a lot of work to clean this up in Matplotlib...

To make the pass-through approach work, the `build_dataset` functions do accept excess parameters and throw them away. That’s definitely a cost. The easiest way to handle it is to have the `build_dataset` functions themselves just pass the actually needed arguments to a stricter, core function, e.g.:

```python
def build_dataset(a, b, **kwargs):
    build_dataset_strict(a, b)

build_dataset(**parameters)  # Succeeds as long as keys named "a" and "b" are in parameters
```
This is a perfect example of the AWS Batch API ‘leaking’ into your code. The whole point of a compute resource pool is that you don’t have to think about how many jobs you create.
This is true. We’re using AWS Batch because it’s the best tool we could find for other jobs that actually do need hundreds/thousands of spot instances, and this particular job goes in the middle of those. If most of our jobs looked like this one, using Batch wouldn’t make sense.
You get language-level validation either way. The `assert` statements are superfluous in that sense. What they do add is in effect `check_dataset_params()`, whose logic probably doesn’t belong in this file.

You’re right. In the explicit example, it makes more sense to have that sort of logic at the call site.
The reason to be explicit is to be able to handle control flow.
The datasets aren’t dependent on each other, though some of them use the same input parameters.
If your jobs are independent, then they should be scheduled as such. This allows jobs to run in parallel.
Sure, there’s some benefit to breaking down jobs even further. There’s also overhead to spinning up workers. Each of these functions takes ~30s to run, so it ends up being more efficient to put them in one job instead of multiple.
Your errors would come out just as fast if you ran `check_dataset_params()` up front.

So then you have to maintain `check_dataset_params`, which gives you a level of indirection. I don’t think this is likely to be much less error-prone.

The benefit of the pass-through approach is that it uses language-level features to do the validation – you simply check whether the parameters dict has keywords for each argument the function is expecting.
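A minimal sketch of that language-level validation (the function and parameter names here are hypothetical): Python itself raises `TypeError` when a required argument is missing from the dict, so there is no separate checker to maintain.

```python
def build_dataset(a, b, **kwargs):
    # Stand-in for real work; **kwargs absorbs any unused parameters
    return a + b

parameters = {"a": 1, "b": 2, "unused": 3}
result = build_dataset(**parameters)  # extra key "unused" is silently absorbed

try:
    build_dataset(**{"a": 1})  # "b" is missing
    missing_arg_raised = False
except TypeError:
    missing_arg_raised = True  # Python enforced the signature for us
```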
A good way to increase feedback rate is to write better tests.
I agree in general, but I don’t think there are particularly good ways to test this without introducing indirection.
Failure in production should be the exception, not the norm.
The failure you’re talking about here is tripping a `try` clause. I agree that exceptions aren’t the best control flow – I would prefer if the pattern I’m talking about could be implemented with if statements – but it’s not really a major failure, and (unfortunately) a pretty common pattern in Python.
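For concreteness, a made-up instance of the pattern in question – using `try`/`except` where an `if` expresses the same control flow:

```python
# Hypothetical config lookup; "retries" and the default value are illustrative.
config = {"timeout": 30}

# "Easier to ask forgiveness than permission": try it, catch the failure
try:
    retries = config["retries"]
except KeyError:
    retries = 3  # fall back to a default

# The equivalent if-based version, which avoids exceptions as control flow
retries_alt = config["retries"] if "retries" in config else 3
assert retries == retries_alt
```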
“refine definite theories”
Where does this quote come from – is it in the book?
Two Designs
Is there a reason you recommend Hy instead of Clojure? I would suggest Clojure to most people interested in Lisp these days, due to the overwhelmingly larger community, ecosystem, & existence of ClojureScript.
Ah, that’s a great example, thanks for spelling it out.
This is sometimes true in functional programming, but only if you’re careful.
I think this overstates the difficulty, referential transparency is the norm in functional programming, not something unusual.
For example, suppose the expression is a function call, and you change the function’s definition and restart your program. When that happens, you need to delete the out-of-date entries from the cache or your program will read an out-of-date answer.
As I understand, this system is mostly useful if you’re using it for almost every function. In that case, your inputs are hashes which contain the source code of the function that generated them, and therefore your caches will invalidate if an upstream function’s source code changed.
Also, since you’re using the text of an expression for the cache key, you should only use expressions that don’t refer to any local variables.
Agreed.
So this might be okay in simple cases when you are working alone and know what you’re doing, but it likely would result in confusion when working on a team.
I agree that it’s essentially a framework, and you’d need buy-in from a team in order to consistently use it in a repository. But I’ve seen teams buy into heavier frameworks pretty regularly; this version seems unusual but not particularly hard to use/understand. It’s worth noting that bad caching systems are pretty common in data science, so something like this is potentially a big improvement there.
This is very cool. The focus on caching a code block instead of just the inputs to the function makes it significantly more stable, since your cache will be automatically invalidated if you change the code in any way.
If you’re using non-modal editing, in that example you could press Alt+rightarrow three times, use cmd+f, the End key (and back one word), or cmd+rightarrow (and back one word). That’s not even counting shortcuts specific to a particular IDE or editor. Why, in your mental model, does the non-modal version feel like fewer choices? I suspect it’s just familiarity – you’ve settled on the options you use most, rather than trying to calculate the optimal fewest keystrokes each time.
Have you ever seen an experienced vim user? 3–5 seconds of latency is completely unrealistic. It sounds to me like you’re describing the experience of someone who’s a beginner at vim and has spent half their life in non-modal editing – and in that case, of course you’re going to be much faster with the latter. And to be fair, vim is extremely beginner-unfriendly in ways that are bad and could be fixed without harming experts – Kakoune (https://kakoune.org/) is similar but vastly better designed for learning.
As a side note, this is my last post in this conversation. I feel like we have mostly been repeating the same points and going nowhere.
One sort-of counterexample would be The Unreasonable Effectiveness of Mathematics in the Natural Sciences, where a lot of math has been surprisingly accurate even when the assumptions were violated.