gjm comments on Bad names make you open the box

gjm 9 Jun 2021 14:25 UTC
6 points
If a function returns a value then in some sense it’s necessarily a get.
Things are more complicated when something both (1) does something and (2) returns a value. E.g., you might put something and then return something that indicates whether it worked or not; you might get something but the process of doing it might update a cache, having (if nothing else) an impact on performance of related future operations.
Some people advocate a principle of “command-query separation”: every operation is a “command” that might change the world (and doesn’t return anything) or a “query” that gives you some information (but doesn’t change anything) but nothing tries to do both at once. (If some commands can fail, you either use an exception-handling mechanism or have related queries for checking whether a command worked.)
That’s nice and clean but sometimes inconvenient; the standard example is a “pop” operation on a stack, which both tells you what’s on the top of the stack and removes it. (If it’s possible that there might be multiple concurrent things operating on the stack at once, you need either to have atomic operations like “pop” or else some explicit mechanism for claiming exclusive access to the stack while you look at its top element and then maybe remove it.)
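A minimal sketch of the stack example, assuming a toy `Stack` class invented here for illustration (the `top` and `drop` names are hypothetical). It puts the strictly separated pair next to the combined `pop`:

```python
class Stack:
    """A tiny stack used only to contrast the two styles."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Command: changes state, returns nothing.
        self._items.append(item)

    def top(self):
        # Query: returns information, changes nothing.
        return self._items[-1]

    def drop(self):
        # Command: removes the top element, returns nothing.
        del self._items[-1]

    def pop(self):
        # Convenient but not separated: reports the top element AND removes it.
        return self._items.pop()

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push("a")
stack.push("b")

# Strict command-query separation: one query, then one command.
seen = stack.top()
stack.drop()

# Combined operation: a single call does both.
popped = stack.pop()
```

With several concurrent users of the stack, the `top()`/`drop()` pair would need exclusive access held across both calls, whereas a single `pop()` is the kind of operation that can be made atomic on its own.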
In the present case, to me “getPromotedPosts” feels ambiguous between (1) “tell me which posts are promoted” and (2) “retrieve the promoted posts from somewhere”. If the function is just called “promotedPosts” then that makes it explicit that either it’s (1) or it’s (2) but the retrieval is an implementation detail you aren’t meant to care about, so I think I prefer “promotedPosts” unless there is a retrieval operation involved and it might be expensive or have side effects that matter.
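To make the two readings concrete, here is a rough sketch under invented assumptions: the `Post` type, the `FakePostStore` class and the snake_case names are all hypothetical, not anything from the original post. One function only says which posts are promoted; the other names the retrieval because it might be expensive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    title: str
    promoted: bool


def promoted_posts(posts):
    """Reading (1): say which of the given posts are promoted. Cheap, no side effects."""
    return [p for p in posts if p.promoted]


class FakePostStore:
    """Stand-in for a database or remote service (hypothetical)."""

    def __init__(self, posts):
        self._posts = list(posts)
        self.queries = 0  # counts round trips so the cost is visible

    def fetch_promoted_posts(self):
        """Reading (2): retrieve the promoted posts from somewhere; may be expensive."""
        self.queries += 1
        return [p for p in self._posts if p.promoted]


posts = [Post("A", True), Post("B", False)]
store = FakePostStore(posts)

local_result = promoted_posts(posts)          # no retrieval involved
remote_result = store.fetch_promoted_posts()  # one round trip
remote_again = store.fetch_promoted_posts()   # a second round trip
```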
- Ericf 9 Jun 2021 15:25 UTC
  4 points
  Parent
  I can see how the choice is architecture dependent. If you can write something like:
  
  Display(promotedPosts())
  Display(recentPosts())
  
  having the function be written without a verb makes sense. If you have a multi-tier architecture where you want to cache things locally, the code might have to be:
  
  PostList = getPromotedPosts()
  Append(PostList, getRecentPosts())
  ShowOnScreen(PostList)
  
  I would say the distinction is that if a function takes a long time to go look at a database and do some post-processing, we don’t want to run around using it like a variable, especially if the database might change between one use of the data and the next but we want to keep the results the same. That way, the code can be:
  
  PromotedPosts = getPromotedPosts()
  Display(PromotedPosts)
  …user clicks a button
  Email(PromotedPosts) //this sends the displayed posts, not whatever the promoted ones happen to be at that moment
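A runnable sketch of the snapshot point in Ericf's comment above, assuming a hypothetical in-memory `PostStore` standing in for the database (all names are invented for illustration). The result is fetched once and reused, so a later change in the store does not affect what gets emailed:

```python
class PostStore:
    """Hypothetical store whose contents can change between calls."""

    def __init__(self):
        self._promoted = ["post-1", "post-2"]

    def get_promoted_posts(self):
        # Return a copy so the caller holds a snapshot, not a live view.
        return list(self._promoted)

    def promote(self, post_id):
        self._promoted.append(post_id)


store = PostStore()

displayed = store.get_promoted_posts()  # fetch once, then treat it as data
store.promote("post-3")                 # the store changes while the user is reading
emailed = displayed                     # reuse the snapshot: the posts the user saw
fresh = store.get_promoted_posts()      # calling again would pick up the change
```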
  - gjm 9 Jun 2021 16:03 UTC
    2 points
    Parent
    Yes, if it “takes a long time to go look at a database and do some post-processing”, that would be a case where (as I put it) “there is a retrieval operation involved and it might be expensive”, and then we might want a name that makes it easier to guess that it might be expensive.
- Adam Zerner 9 Jun 2021 17:17 UTC
  3 points
  Parent
  Thanks for the explanation here. I didn’t know the phrase “command-query separation”. It’s also helpful to be aware that “pop” is the standard example.
  
  In the present case, to me “getPromotedPosts” feels ambiguous between (1) “tell me which posts are promoted” and (2) “retrieve the promoted posts from somewhere”.
  
  I might be in the minority here, but something like promotedPosts feels too much like a variable. It feels awkward to me when the name of a function isn’t a verb.
  
  I agree about the ambiguity you point out, and for that reason I don’t feel good about the name getPromotedPosts. (Although you could establish a convention where the term “retrieve” or “fetch” is used for database access and “get” is used for situations like this.) I’m just not sure what would be better. I considered filterPromotedPosts, but that kinda sounds like it’s impure and is mutating the argument that’s passed in. Maybe filterPromotedPosts would be a good name if you’re working in a functional language though. It’s impossible to do such a mutation in a functional language, so the ambiguity goes away. I think that’s an interesting and often overlooked benefit of functional languages.
  - philh 13 Jun 2021 10:36 UTC
    2 points
    Parent
    The other thing about filterPromotedPosts is that it kind of sounds like the input is promoted posts and the output is some unspecified subset of them. filterPostsForPromoted avoids that but starts to feel unwieldy to me. (But maybe I should just be more okay with unwieldy names.)
    
    Even in an impure language I think filter sounds to me like it would return a new list rather than editing in place. That’s how the Python filter function works for example, and Perl’s grep (which is basically a synonym for me), and I had to look this up but JavaScript’s filter too.
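A quick check of the Python case mentioned in philh's comment above, using made-up sample data. One detail worth knowing: in Python 3 the built-in `filter` returns a lazy iterator rather than a list, but either way the input is left untouched. The mutating `keep_promoted_in_place` helper is hypothetical and is there only for contrast.

```python
posts = [
    {"title": "A", "promoted": True},
    {"title": "B", "promoted": False},
    {"title": "C", "promoted": True},
]


def is_promoted(post):
    return post["promoted"]


# Built-in filter: produces a new sequence of results; `posts` is not modified.
promoted = list(filter(is_promoted, posts))


def keep_promoted_in_place(post_list):
    """Hypothetical mutating variant: removes non-promoted posts from post_list."""
    post_list[:] = [p for p in post_list if is_promoted(p)]


mutable_copy = list(posts)
keep_promoted_in_place(mutable_copy)  # this one does edit its argument
```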
    - Adam Zerner 13 Jun 2021 17:54 UTC
      2 points
      Parent
      
      The other thing about filterPromotedPosts is that it kind of sounds like the input is promoted posts and the output is some unspecified subset of them. filterPostsForPromoted avoids that but starts to feel unwieldy to me. (But maybe I should just be more okay with unwieldy names.)
      
      I have the exact same feelings here. It’s funny how hard this is to name! Although these issues go away if you think about the name as only one part of the box’s label, and the signature + docstring as the others. Sorta. I think it’d still be nice if the name did as much of the job as possible by itself without having to consult the signature or docstring.
      
      Even in an impure language I think filter sounds to me like it would return a new list rather than editing in place.
      
      In my experience the ideas of functional programming are things that a lot of people just aren’t aware of at all. I know that for me it was about seven years into my journey as a programmer before I started learning about them. Thinking about the people I have worked with and work with now, I could very well see them using filterPromotedPosts to mutate a list of posts. So in that environment, it seems like it’d be nice to make it extra clear that “this function isn’t actually mutating anything”. (Then again, I could also see them mutating stuff in getPromotedPosts too.)
      
      But in a different environment where the convention of “filter” being pure is strong enough, I agree with you. And I think that it’d often make sense to aspire towards this sort of environment. It’s interesting how much the right name depends on this sort of context.
- ChristianKl 9 Jun 2021 16:58 UTC
  2 points
  Parent
  To me getPromotedPosts() contains the idea that the function won’t run a neural model to decide which post should be promoted or load information from the internet but return to me data that’s already available in the program. On the other hand promotedPosts() feels unclear about that.
  I’m curious whether other people have the same intuition here.
  - gjm 9 Jun 2021 19:20 UTC
    2 points
    Parent
    My intuition says that
    if it’s called getPromotedPosts then it is probably fetching some information from somewhere—maybe the internet, maybe a database—and probably isn’t doing any computation to speak of;
    if it’s called promotedPosts then it is probably either computing something or just using a value it already knows and can return quickly and easily.
    I am not sure there’s any function name that would be perfectly neutral between (1) extremely cheap operation, probably just returning something already known, (2) nontrivial calculation, and (3) nontrivial fetching.
    There’s also a bit of ambiguity about whether something called getPromotedPosts is fetching the posts themselves or just cheap representations of them (e.g., ID numbers, pointers, etc.).
    So I might consider names like fetchPromotedPostIDsFromDatabase, retrievePromotedPostContent, inferPromotedPostsByModel, cachedPromotedPostList, etc. Or I might prefer a brief name like promotedPosts and put information about what it does and the likely performance implications in a comment, docstring, etc.
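One way the two options in gjm's last paragraph might look side by side, as a sketch only: the fake database, the counter and the snake_case names are assumptions made up for this example. The longer names spell out what happens, while the brief `promoted_posts` leaves the cost to its docstring.

```python
from functools import lru_cache

# Hypothetical stand-ins for a real database.
_FAKE_DATABASE = {101: "Post one", 102: "Post two"}
_PROMOTED_IDS = (101, 102)
fetch_count = 0  # counts simulated database round trips


def fetch_promoted_post_ids_from_database():
    """Explicit name: a round trip to the database that returns IDs only."""
    global fetch_count
    fetch_count += 1
    return _PROMOTED_IDS


@lru_cache(maxsize=1)
def cached_promoted_post_ids():
    """Explicit name: the first call fetches, later calls reuse the stored result."""
    return fetch_promoted_post_ids_from_database()


def promoted_posts():
    """Return the content of the promoted posts.

    Cost: may hit the database on the first call; later calls reuse cached IDs.
    """
    return [_FAKE_DATABASE[post_id] for post_id in cached_promoted_post_ids()]


first = promoted_posts()
second = promoted_posts()
```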