Megan Hazen (meganursula) wrote 2008-09-11 01:22 pm
Entry tags: i'm back to needing a work icon
Question for object oriented gurus:
I am currently reviewing some code implementing a current standard of an algorithm i frequently use. (I want to examine some modifications to the algorithm, but, i need a good baseline to compare to.) In it we see something like:
struct velocity
{
    int size;
    double v[D_max];
};
There are a lot of these structs - position, quantum, etc.
Thing is, in my code, i generally declare
int num_dim = n; // this is what they are using size for up above
double position[num_dim];
double velocity[num_dim];
(quantum, for the record, appears to be taking the place of what i usually declare as a constant Eps, and is used to get around numerical issues when looking for zero.)
etc. I do not have additional structs. Thing is, i find all this structifying to be sort of pointless and irritating. Pointless because i do not know what the structs are adding to the code. Irritating because i think they add a level of obfuscation, rendering the code not only longer, but also much less readable.
My question - what, if anything, am i missing in this situation? I get, generally, what object oriented-ness does for you. But i haven't used it very much in the past 6 or so years. (Matlab's excuse for object oriented isn't worth bothering with.) Right now i find myself faced with a few examples of modern code that are object oriented up the ass, and it just seems like it all has been taken too far. If i give myself three months will i become a believer? Will i stop feeling like there should be some sort of natural progression through code and adapt to having objects interacting at will?
no subject
Stroustrup, the guy who invented C++, foresaw objects being used primarily for very large chunks of code and data, not so much for little things like points.
You might make a position/velocity array, as a whole, be an object. That depends on whether there's much of anything you would normally do to the whole thing together. Big vectors are often object-ized in other languages to give you easy syntax for things like dot products, cross products, et cetera. I wound up doing a lot with object-ized vectors while writing a lot of numerical integration and differentiation code, for instance. You save passing in two arrays and a size for every function call, instead passing in only an object pointer -- yay for less typing.
Then again, if you don't do much of that then you probably don't care. So yeah, it depends on the algorithm, and on whether you'll be doing more with the same sort of objects (position/velocity arrays), and so on. But your basic intuition (that a single velocity or position shouldn't be an object) is dead on for the sort of situations it sounds like you're discussing.
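To make the "object-ized vector" point concrete, here is a minimal sketch in C++, reusing the `size`/`D_max` layout from the struct in the post (the value of `D_max` and the `dot` member are my own illustration, not from the original code):

```cpp
#include <cstddef>
#include <stdexcept>

// Assumed compile-time maximum dimension, as in the post's struct.
constexpr std::size_t D_max = 8;

struct Vec {
    std::size_t size;
    double v[D_max];

    // One member call replaces passing two arrays plus a size.
    double dot(const Vec& other) const {
        if (size != other.size)
            throw std::invalid_argument("dimension mismatch");
        double sum = 0.0;
        for (std::size_t i = 0; i < size; ++i)
            sum += v[i] * other.v[i];
        return sum;
    }
};
```

A call site then shrinks to `a.dot(b)` instead of `dot(a.v, b.v, a.size)`, which is the "less typing" payoff described above.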
no subject
Things I like to use structures for:
- encapsulation: keep related data together, which makes it easier to pass it around or change the implementation. Sometimes you end up replicating data, like in your example. That rarely matters.
- type-checking. You're declaring a velocity, and that has a bunch of functions related to it that deal with velocities. If your function demands an int and a double*, you might screw up and pass in the wrong int or the wrong double*; if it demands a velocity*, you're not going to make that mistake.
- in C++, you can (and should) declare operator[] on array-like classes, so that you could then use
velocity v; v[2] = whatever;
and it would check the bounds of your array. You also get access to template libraries like STL and Boost, which have all sorts of extremely useful data structures like arrays that resize automatically, linked lists, balanced binary trees, hash tables, fixed-size arrays that check array bounds, etc.
no subject
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
Then, yank out the AlmostEqual2sComplement function, rename it, and use that in all your comparisons.
no subject
By "wrong" I mean that, usually, just saying two values less than 1e-6 from each other are equal is nonsense. Sometimes it makes sense, if you know what the units really are. But that's rare in my experience; usually we just make up 1e-6 and move it down to 1e-8 when that causes problems, and maybe that will cause problems too, because now two numbers in the billions will test not-equal when they should be equal, etc.
no subject
I think that your answer seems to back this up, no?
I suppose part of the reason i'm curious to hear people's feedback is that i currently have three versions of what should be approximately the same algorithm - mine (in Matlab, about 20 lines worth), the web's (in C, about 21 pages worth, printed out!), and my co-worker's (in Python, intermediate in length). I find the other two completely unreadable. But when you start complaining about EVERYONE ELSE, it seems likely that the problem is really YOURSELF.
One thing i find tricky is that the code never seems to follow a single path that you can read straight through (well, duh, i guess, it isn't procedural); instead you are jumping around to far-flung objects trying to figure out what snippets do. Is there a good way around that, or is it just a price that you pay?
no subject
Thanks for the code!
no subject
(Anonymous) 2008-09-11 09:29 pm
Re: your last paragraph... To some extent, that's the price you pay. It bugs me too. Mostly, it's a different way of thinking. Some of OO's critics refer to it as The Kingdom of Nouns, which is probably a good way of thinking here -- procedural programming, as the name implies, is all about verbs, actions, *doing*. OO is all about objects, nouns, and data.
There's a good programming quote which goes something like, "show me your algorithms and I will check your tables to see how they work. Show me your tables and your algorithms will be unnecessary." OO is a sort of formalization of that idea. It's the idea that data is king, and functions are little adjuncts to it. And if you flip that, you can see how procedural programming is the idea that actions are king, and data are little adjuncts to it. In a procedural program, you often have to search in various places to get a coherent picture of the overall data. In the OO mirror-world, you often have to search in various places to get a picture of the control flow, but the data is nicely situated to give you a coherent overall picture.
Functional and Aspect-Oriented programming, probably the next two big contenders after procedural and OO, organize on completely different principles, making it *both* difficult to get an overall picture of the data *and* difficult to get an overall picture of control flow :-)
no subject
There is actually support in python (at least with the right package) for stuff like that, AND my co-worker vectorized everything, which takes fewer lines than my own loop system, and his code is still longer.
Reviewing the code, it seems like one reason that the C code is still long is that it is built to be a bit more flexible. And, frankly, having every curly brace on its own line adds pages. I don't think that explains everything.
no subject
Also, I don't think you mean higher-order function the way I usually mean it, but I can't quite figure out what you mean. Solving Ax=b is written "A \ b" (as in, "divide" b by A, but with a backslash because it's left division, not really division). Maybe matrix multiply would have been a better example, but I forget what the matlab operator is for that.
no subject
So, the C code doesn't actually do anything to ensure that you don't overwrite an array. So they're not writing extra code to make a vector type. That's not the reason.
And my point was that, yes, there is a lot of functionality built into matlab that's not built into C, but this code doesn't use any of it. The thing that it could be doing is replacing loops with matrix operations, but my matlab code doesn't do that. The python code does, and it's still longer.
no subject
Since there are no cultural associations to organize around data or around code-flow (neither basically OO nor basically procedural), most people do neither. Thus, my comment about it being hard to trace both, which is what happens when you don't work to make it easy to do one or the other. It's entirely possible that OCaml, an unholy hybrid of functional (not pure-functional) and OO, manages to have those OO "Kingdom of Nouns" cultural expectations and organize its programs around data like regular OO programs. I've avoided OCaml, mainly because I'm neither a big OO fan nor at all an ML fan, and what few things I liked about ML would be basically destroyed by an OO type system and what it would do to type inference.
Yes, pure-functional (no modifications or side-effects allowed) does solve some problems with ordering, and especially with concurrency. When I said "functional" I meant more "allowing functions as first class objects and encouraging passing such objects, and large data structures, through function calls" rather than "prevents all side-effects." While the two are usually culturally bundled, they're technically orthogonal issues -- you could have a language as limited as C and still prevent side-effects, it's just that nobody's crazy enough to do that, because avoiding side-effects is much harder than not, so you generally need the language to work harder to accommodate you while you're working out how to do things without all those intuitive imperative tricks like "print this" or "assign this value to a variable" that are now verboten.
However, if you think of "functional" as meaning "has functions as first-class objects", it's a whole different set of assumptions and problems. Ruby, Python, SML/NJ are all languages that allow first-class function objects without any prohibition on side-effects (functional, but not pure-functional). So they don't get the advantages you mention, but they still tend to organize code in wacky ways. When you're using a lot of map/reduce code (or its equivalent in other languages), you wind up wanting to create big sequences and pass them through a set of big sequence operations, which is what map and reduce are. It doesn't much matter how you organize your static chunks of data, and barely matters how you organize your data structures at that point -- the heart and soul of the program is going to be that very small amount of nearly-impenetrable code that subtly handles your map/reduce stuff, along with some lambdas which will be defined nearby....
Which is fine, and works well for many people. Hell, my Ruby code looks exactly like that. And I try to organize it as procedural rather than OO, because if I'm going to make it that hard to follow control flow then I might as well organize to make it a *little* easier to follow control flow.
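The map/reduce shape described above can be sketched in C++ as well as in Ruby or Python: build a sequence, map a lambda over it, then reduce the result. (The function name and the sum-of-squares task are illustrative choices, not anything from the thread.)

```cpp
#include <vector>
#include <algorithm>
#include <numeric>

// Map then reduce: square every element, then fold the squares
// into a single sum. The lambdas defined inline are the "nearby
// lambdas" the comment above describes.
double sum_of_squares(const std::vector<double>& xs) {
    std::vector<double> squared(xs.size());
    std::transform(xs.begin(), xs.end(), squared.begin(),
                   [](double x) { return x * x; });           // map
    return std::accumulate(squared.begin(), squared.end(), 0.0); // reduce
}
```

The point stands in any language: the iteration machinery disappears into `transform`/`accumulate`, and the program's real logic lives in the small lambdas passed to them.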
no subject
I've seen code that was just a lambda soup. It helped make me despise Ruby, although it wasn't the language's fault (plenty enough other things were). Most code I run into sees a function that takes a closure as basically a loop, which is perfectly readable.
no subject
Assuming PL == "programming language", you overgeneralize. That is one thing functional means. Much like "OO" can mean "using objects for polymorphism", and usually does, but may not. It can also mean "all types descended from a single parent type," but often does not.
Most code I run into sees a function that takes a closure as basically a loop, which is perfectly readable
Yup, that's actually how Ruby does most iteration.
And yeah, lambda soup is a pain in the ass. Ruby is interesting because it has little enough in common with most of its predecessor languages that its fans are really still figuring out how to write it, stylistically. So "Ruby style" is all over the map, and still evolving fairly rapidly for a language of its age.
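The "function that takes a closure as basically a loop" pattern reads the same outside Ruby; here is the equivalent shape in C++, where the algorithm owns the iteration and the caller supplies the loop body as a lambda (the function name and task here are illustrative):

```cpp
#include <vector>
#include <algorithm>

// std::for_each plays the role of Ruby's each: it owns the
// iteration, and the lambda is the loop body.
int count_positive(const std::vector<int>& xs) {
    int n = 0;
    std::for_each(xs.begin(), xs.end(), [&n](int x) {
        if (x > 0) ++n;   // "loop body" captured in the closure
    });
    return n;
}
```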
no subject
My main beef with Ruby was that it didn't seem to offer much over python, except for tricky design issues that perl and python faced and overcame 5-10 years earlier. What it has come up with since I last got burned is a mature Ruby on Rails, which is the shiznit when it comes to writing web sites, or so I'm told. My main beef with both python and ruby was that they do almost no static checking, which means if you misspell a function name on a line of code that is only invoked after 30 minutes of computation, you want to kill something when those 30 minutes are up and you just lost the data.
As for PL, I meant the theoretical programming languages research community. Functional means no side-effects; first-class functions means lambdas; polymorphism usually means parametric polymorphism, not the god-awful OO inheritance stuff; first- and higher-order types are in vogue; any language in common industrial usage is hardly worth dignifying with the term "language." None of that is really in dispute in these halls (among the PL types -- others are very happy about how Ruby "doesn't get in the way" [of writing buggy code]). Once you drink the kool-aid, it's pretty hard to go back. Languages in industrial usage lag the research community by 20-30 years of course, though Simon Peyton-Jones at MSR Cambridge is pushing some pretty modern features into C#.
We're by now pretty far afield of the original topic aren't we...
no subject
Most of what I like about Ruby specifically is metaprogramming, which can be thought of as either poor-man's LISP macros or structured-eval-plus-runtime-type-definition. Rails happened in Ruby because Ruby has metaprogramming, which Rails uses quite extensively.
I like Python just fine, but its indents-are-syntactic quirk makes it very hard to embed as a templating language. That's not a huge drawback overall, but again, Rails needs a very robust templating language since a lot of what it does involves generating HTML from templates. And honestly, I just like Ruby syntax better for most stuff.
no subject
One thing I should probably have thought of, but wasn't thinking of since I don't usually write this kind of code: if you're basically trawling through dense matrices and vectors, then, yeah, making lots of structures is pretty silly (except for error handling, but your competition isn't doing that, which makes me weep). It matters a lot more when you have complex data structures all linked together by pointers -- which is what I usually deal with. And once I've crossed that bridge, somewhere I probably needed a vector, and so I want your dense linear algebra function to be able to take that vector type instead of forcing me to copy it into an array.
no subject
No, i don't think so. I think you're overestimating the complexity of the actual algorithm.
I suppose all these numbers are fuzzy, though, so before i make people think i'm being too precise, maybe i should go through the exercise and actually count the code lines (as opposed to comment lines).