(I have a sense that the answer to this question is in the post, but I’m having trouble extracting it.)
There’s something that I think of as composition, and I’m not sure whether it fits the definition of abstraction. Consider, in the context of programming, a User that has email and password properties. We think of User as an abstraction: it’s a thing that is composed of an email and a password. I’m not seeing how this fits the definition of abstraction, though. In particular, what information is being thrown away? What is the low-level model, and what is the high-level model?
The User example demonstrates composition of properties, but you could also have composition of instructions. For example, setPassword might consist of 1) saltPassword, 2) hashPassword and then 3) savePassword. Here we’d say that setPassword is an abstraction, but what is the information that is being thrown away?
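For concreteness, the two kinds of composition might be sketched like this (a toy sketch: the salt/hash/save helpers are placeholders, not real crypto or a real API):

```typescript
// Composition of properties: User is several fields glommed together.
interface User {
  email: string;
  passwordHash: string;
}

// Toy stand-ins for the three internal steps (placeholders, not real crypto).
function saltPassword(pw: string): string {
  return "salt:" + pw;
}
function hashPassword(salted: string): string {
  // Placeholder "hash": just reverse the string.
  return salted.split("").reverse().join("");
}
function savePassword(user: User, hashed: string): void {
  user.passwordHash = hashed;
}

// Composition of instructions: setPassword is the three steps in sequence.
function setPassword(user: User, pw: string): void {
  savePassword(user, hashPassword(saltPassword(pw)));
}
```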
This is a great question.
Months ago, I had multiple drafts of posts formulating abstraction in terms of transformations on causal models. The two main transformations were:
Glom together some variables
Throw out information from a variable
What you’re calling composition is, I believe, the first one.
Eventually, I came to the view that it’s information throw-away specifically which really characterizes abstraction—it’s certainly the part where most of the interesting properties come from. But in the majority of use-cases, at least some degree of composition takes place as a sort of pre-processing step.
Looking at your specific examples:
User looks like it’s just a composition, although we could talk about it as a degenerate case of abstraction where all of the information is potentially relevant to things outside the User object itself, so we throw away nothing. That said, a lot of what we do with a User object actually does involve ignoring the information in its fields—e.g. we don’t think about emails and passwords when declaring a List<User> type or working with that list. So maybe there’s a case to be made that it’s an abstraction in the information-throw-away sense.
setPassword does throw away information: it throws away the individual steps. To an outside caller, it’s just a black-box function which sets the password; they don’t know anything about the internal steps.
So to the extent that these are both “throwing away information”, it’s in the sense that large chunks of our code treat them as black-boxes and don’t look at their internal details. When things do look at their internal details, those things are “close to” the objects, so the abstraction breaks down/leaks—e.g. if something tried to reconstruct the internal steps of setPassword, that would definitely be an example of leaky abstraction.
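One way to see the thrown-away information concretely: from the caller’s side, setPassword is just a value of a function type, and that type says nothing about salting, hashing, or storing. A toy sketch (the two implementations are hypothetical stand-ins):

```typescript
// The caller's entire view of setPassword is this function type;
// the salting/hashing/saving steps appear nowhere in it.
type User = { email: string; passwordHash: string };
type SetPassword = (user: User, pw: string) => void;

// Two implementations with different internals...
const setPasswordA: SetPassword = (u, pw) => {
  u.passwordHash = "A:" + pw; // pretend: salt+hash variant A
};
const setPasswordB: SetPassword = (u, pw) => {
  u.passwordHash = "B:" + pw; // pretend: salt+hash variant B
};

// ...while code written against the type alone treats them as
// interchangeable black boxes.
function register(set: SetPassword, email: string, pw: string): User {
  const u: User = { email, passwordHash: "" };
  set(u, pw);
  return u;
}
```

Anything that does distinguish the two (say, by inspecting the stored hash) is exactly the “close to the object” code where the abstraction leaks.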
Hm, it seems to me that there is a distinction between 1) hiding information (or encapsulating it, or making it private), 2) ignoring it, and 3) getting rid of it altogether.
For setPassword, perhaps a programmer who uses this method can’t see the internals of what is actually happening (the salting, hashing, and storing). They just call user.setPassword(form.password) and it does what they need it to do.
For User, in the example you give with List<User>, maybe we want to count how many users there are, and in doing so we don’t care about what properties users have. It could be email and password, or it could be username and dob; in the context of counting how many users there are, you don’t care. However, the inner details aren’t actually hidden; you’re just choosing to ignore them.
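Counting is a concrete case of this kind of ignoring: the code below only uses the length of the list, so the element type never matters (a toy sketch; the field names are just illustrative):

```typescript
// Two hypothetical user shapes with completely different fields.
type EmailUser = { email: string; password: string };
type HandleUser = { username: string; dob: string };

// Counting never inspects the elements, so the same function works
// for either shape; only the list structure is used.
function countUsers<T>(users: T[]): number {
  return users.length;
}
```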
For ideal gases, we’re getting rid of the information about particles. It’s not that it’s hidden/private/encapsulated; it’s just not even there after we replace it with the summary statistics.
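The ideal-gas case can be sketched as literally replacing the particle data with summary statistics (toy numbers; “meanSqSpeed” here is just the mean squared speed, not a real temperature in physical units):

```typescript
// High-level model: keep only aggregate statistics, discard per-particle data.
function summarize(speeds: number[]): { n: number; meanSqSpeed: number } {
  const meanSq = speeds.reduce((acc, v) => acc + v * v, 0) / speeds.length;
  return { n: speeds.length, meanSqSpeed: meanSq };
}

const particleSpeeds = [1, 2, 3];
const gas = summarize(particleSpeeds);
// From here on, the high-level model has only `gas`; the individual speeds
// [1, 2, 3] are not recoverable from { n: 3, meanSqSpeed: 14/3 }.
```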
What do you think? Am I misunderstanding something?
And in the case that I am correct about the distinction, I wonder if it’s something worth pointing out.
Sounds like the distinction is about where/how we’re drawing the abstraction boundaries.
“Hiding information” suggests that there’s some object X with a boundary (i.e. a Markov blanket), and only the summary information is visible outside that boundary.
“Ignoring information” suggests that there’s some other object(s) Y with a boundary around them, and only the summary information about X is visible inside that boundary.
So basically we’re defining which variables are “far away” by exclusion in one case (i.e. “everything except blah is far away”) and inclusion in the other case (i.e. “only blah is far away”). I could definitely imagine the two having different algorithmic implications and different applications.
As for “getting rid of information”, I think that’s hiding information plus somehow eliminating our own ability to observe the hidden part. Again, I could definitely imagine that having additional algorithmic implications or applications. (Though this one feels weird for me to think about at all; I usually imagine everything from an external perspective where everything is always observable and immutable.)
Yeah I think your descriptions match what I was getting at.