# Locklin on science

## The most beautiful rocket

Posted in big machines by Scott Locklin on November 14, 2013

Rocketry is a field which peaked in the 1960s, probably never to improve appreciably. The space shuttle? A flying brick. The attempted replacement for the space shuttle, the Ares, was a piece of junk. They originally tried to base it on shuttle technology “to save money” -a lovely demonstration of the sunk cost fallacy- but inexorably, fundamental systems (like the upper stage engines) were replaced by 1960s era technology. The thing was such a piece of junk, it would have been more aerodynamically stable if it were flying backwards. Most flying machines are supposed to have more drag at the rear end than at the front end; otherwise they want to tumble. It looks wrong, because it is wrong. You could tell it was going to fail just by looking at the mock-ups.

What a pogo stick looking piece of junk

Rockets are interesting in their simplicity. In liquid fueled form, they’re at their most complex, but they’re still pretty simple. There are some pumps to pump the liquid fuel and oxidizer into the combustion chamber. Inside the combustion chamber are some injector nozzles to squirt the stuff around and mix it together. There may or may not be a sparkplug to light it on fire. Once it is on fire, the gas gets very hot and squirts out a nozzle. The whole thing is essentially a tank, a pump, a chamber to burn it all in, and a nozzle. There are gimbals and steering jets to keep the thing on its path, but the real business of the rocket is pretty simple. The rocket moves forward by flinging things out the back end really fast. There is nothing complicated about rockets; they’re all nozzles and pumps and pipes and tanks. Really, they’re just a complicated spray can.

Spray can

One of the amusing subtleties of rockets: ever wonder why the exhaust nozzles are there? Why not just have a big hole at the business end? You may think it is to prevent flames from licking up the rocket’s side or something like that, but the reality is more interesting. The angle of the nozzle is necessary to extract the maximum energy from the burning gases. Part of the extracted energy comes from the gas expanding as it leaves the combustion chamber; the expanding gas pushes on the nozzle, and the nozzle captures that work. The shape of the nozzle is dictated by the heat capacity of the fuel used and the pressure the nozzle works at, so a nozzle designed to work at sea level looks quite different from one which is supposed to work in space, where there is no ambient pressure. The thermodynamics for this is quite cute; I had no idea until I read a book on rocket science (Rocket Propulsion Elements by Sutton; PDF link here). You can figure out optimal nozzle shapes using simple ideas like the ideal gas equation and some calculus.
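
To make the “ideal gas equation and some calculus” point concrete, here is a minimal Python sketch (mine, not from Sutton’s text or the post) of the standard ideal-nozzle relations: exhaust velocity and exit-to-throat area ratio as functions of the pressure ratio. The chamber conditions and gas properties below are illustrative guesses, not Titan II figures.

```python
from math import sqrt

R_UNIVERSAL = 8314.5  # J/(kmol*K)

def exhaust_velocity(T_c, p_c, p_e, gamma, M_w):
    # Ideal exhaust velocity [m/s]: chamber temperature T_c [K], chamber
    # and exit pressures p_c, p_e [Pa], ratio of specific heats gamma,
    # molecular weight of the exhaust M_w [kg/kmol].
    R = R_UNIVERSAL / M_w
    pr = p_e / p_c
    return sqrt(2.0 * gamma / (gamma - 1.0) * R * T_c
                * (1.0 - pr ** ((gamma - 1.0) / gamma)))

def expansion_ratio(p_c, p_e, gamma):
    # Exit-to-throat area ratio A_e/A_t for a given pressure ratio: this
    # is what makes a vacuum bell so much bigger than a sea-level one.
    pr = p_e / p_c
    throat_over_exit = (((gamma + 1.0) / 2.0) ** (1.0 / (gamma - 1.0))
                        * pr ** (1.0 / gamma)
                        * sqrt((gamma + 1.0) / (gamma - 1.0)
                               * (1.0 - pr ** ((gamma - 1.0) / gamma))))
    return 1.0 / throat_over_exit

# Same notional engine, expanded to sea level vs. to near-vacuum:
print(exhaust_velocity(T_c=3400, p_c=7e6, p_e=101325, gamma=1.22, M_w=22))  # ~2750 m/s
print(expansion_ratio(p_c=7e6, p_e=101325, gamma=1.22))  # sea-level bell, ~9x
print(expansion_ratio(p_c=7e6, p_e=5000, gamma=1.22))    # vacuum bell, ~90x
```

The same notional chamber wants roughly a 9:1 bell at sea level and something like ten times that to expand properly in vacuum, which is why the two kinds of nozzle look so different.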

My favorite rocket is the Titan II. The Titan II was the first military ICBM that was worth a tinker’s damn. It carried a giant nuclear warhead, meant to wipe out large cities in one shot with a 9 megaton yield. It was the heaviest payload carried by an American ICBM, and as such, it was kept in service until 1987. As a NASA launch vehicle, it lasted over 40 years, and would still be used, but America is no longer an industrial power, so we no longer make its fuels in enough quantity to be cost effective. Its fuel is a toxic brew of caustic, poisonous hydrazines; the oxidizer is nitrogen tetroxide, which turns into nitric acid when exposed to water. Nasty stuff, but very practical nasty stuff compared to what most liquid fueled rockets use, since it is all liquid at room temperature. Cryogenics like liquid oxygen and liquid hydrogen (more common liquid rocket propellants) are more difficult to handle, and they take a long time to load into a rocket. For military applications this was important, as you want to be able to launch at a moment’s notice. These fuels were also hypergolic, meaning they light on fire when they touch each other: not having a spark plug means one less thing to fail. Astronauts loved these fuels too, as they made for quick countdowns, rather than sitting on top of an explosive firecracker while fueling up with cryogenics. I love these fuels because of their toxic insanity. The real chemistry of rockets is more insane than any science fiction.

What lights my jets about the Titan II is its symmetry. It doesn’t have a dozen engines at the business end: just two, and they look damn cool, like dual quads on a fast car. The ratio of diameter to overall length is 1:10, a ratio evocative of Viking broadswords. Not squat and ugly like the Polaris or the early Soviet launchers. Nor does it have the ugly staggered cone shape of the Saturn-V. Nor did it lumber and loiter at the launchpad like so many launch vehicles; it shot into space with dispatch and purpose. Even the exhaust plume is more beautiful than other rockets’; it’s orange, and the two engines make a dignified narrow column of fire as it hurtles up to space. It looks like a rocket should. It is a graceful design, and they didn’t ruin it by putting cowlings over the interesting looking combustion chambers and elegant exhaust nozzles. Even the second stage engines are partially exposed to the air, like the engine in a top fuel dragster. It looks like it means business. It looks like the type of thing which could flatten a city, or lob a Freemason or two into orbit. It sent men into space, and was a genocidal trump card in protecting the formerly Free World from communism. As a cold war aesthetic artifact it has few equals.

“The Titan took off quickly!”

“It also looked good in its roll over and second stage activation; like a real space ship; most rockets look ugly doing this”

“watch the whole thing from another angle”

## Shannon information, compression and psychic powers

Posted in information theory, J by Scott Locklin on October 31, 2013

Fun thing I found in the J-list, a game of code golf. Generate the following pattern with the fewest bytes of code:

_____*_____
_____*_____
____*_*____
___*___*___
__*_____*__
**_______**
__*_____*__
___*___*___
____*_*____
_____*_____
_____*_____

The J solution by Marshall Lochbaum is the best one I’ve seen, taking only 18 bytes of code.

'_*'{~4=+/~ 4<.|i:5


I’ll break this down into components for the folks who don’t speak J, so you can see what’s going on:

i:5
NB. -5 to 5;


_5 _4 _3 _2 _1 0 1 2 3 4 5

|i:5
NB. make the list positive


5 4 3 2 1 0 1 2 3 4 5

4<.|i:5
NB. cap it at 4


4 4 3 2 1 0 1 2 3 4 4

+/~ 4<.|i:5
NB. the ~ adverb makes the verb to its left reflexive: f~ y means y f y.
NB. Used dyadically, +/ builds the addition table (an outer product),
NB. so this is the table of sums of the capped list with itself.


8 8 7 6 5 4 5 6 7 8 8
8 8 7 6 5 4 5 6 7 8 8
7 7 6 5 4 3 4 5 6 7 7
6 6 5 4 3 2 3 4 5 6 6
5 5 4 3 2 1 2 3 4 5 5
4 4 3 2 1 0 1 2 3 4 4
5 5 4 3 2 1 2 3 4 5 5
6 6 5 4 3 2 3 4 5 6 6
7 7 6 5 4 3 4 5 6 7 7
8 8 7 6 5 4 5 6 7 8 8
8 8 7 6 5 4 5 6 7 8 8

4=+/~ 4<.|i:5
NB. Which of these is equal to 4?


0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 1 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0
0 0 1 0 0 0 0 0 1 0 0
1 1 0 0 0 0 0 0 0 1 1
0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0

'_*'{~4=+/~ 4<.|i:5
NB. { selects items from an array by index; the ~ swaps its arguments,
NB. so '_*' {~ m is the same as m { '_*': the 0s pick '_' and the 1s
NB. pick '*', which creates the required array of characters and voila!


_____*_____
_____*_____
____*_*____
___*___*___
__*_____*__
**_______**
__*_____*__
___*___*___
____*_*____
_____*_____
_____*_____

I asked some Facebook pals for a solution, and got some neat ones:

Python from Bram in 94 bytes:

for i in range(-5,6):print ''.join(['*_'[min(abs(i),4)+min(abs(j),4)!=4] for j in range(-5,6)])


A clever 100 byte Perl solution from Mischa (to be run with -Mbigint):

for$i(0..10){print($_%12==11?"\n":1<<$i*11+$_&41558682057254037406452518895026208?'*':'_')for 0..11}


Some more efficient solutions by Mischa in Perl and assembler.

Do feel free to add clever solutions in the comments! I’d like to see a recursive solution in an ML variant, and I’m pretty sure this can be done in only a few C characters.

It’s interesting that the displayed pattern is only 132 bytes (counting carriage returns; 121 without), yet it is hard in most languages to do this in under 100 bytes. The winner of the contest used a 73 byte shell script with gzipped contents attached. Pretty good idea and a praiseworthy hack. Of course, gzip assumes any old kind of byte might need to be packed, so it’s not the most efficient packing mechanism for this kind of data.

This gets me to an interesting point. What is the actual information content of the pattern? Claude Shannon told us this in 1948. The information content of a character in a string is approximately $\sum\limits_{i=1}^{n} prob_i \log_2\left(\frac{1}{prob_i}\right)$

If you run this string through a gizmo for calculating Shannon entropy (I have one in J, but use the R infotheo package), you’ll find out that it is completely specified by 3 bits sent 11 times, for a total of 33 bits or 4.125 bytes. This sounds completely insane, but look at this coding for the star pattern:

000 _____*_____
000 _____*_____
001 ____*_*____
010 ___*___*___
011 __*_____*__
100 **_______**
011 __*_____*__
010 ___*___*___
001 ____*_*____
000 _____*_____
000 _____*_____

So, if you were to send this sucker on the wire, an optimal encoding gives 33 bits total. Kind of neat that the J solution is only 144 bits. It is tantalizing that there are only 5 distinct messages here, leaving room for 3 more messages. The Shannon information calculation notices this; if you do the calculation, it’s only 2.2 bits. But alas, you need some kind of wacky pants quantum computer that hasn’t been invented yet to send fractions of bits, so you’re stuck with a 3 bit code.
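
For the curious, here is a small Python sketch (my own, not the J gizmo or the R infotheo package mentioned above) that redoes the arithmetic: treat each of the 11 rows as a message and compute the Shannon entropy of the empirical row distribution.

```python
from collections import Counter
from math import log2

rows = """_____*_____
_____*_____
____*_*____
___*___*___
__*_____*__
**_______**
__*_____*__
___*___*___
____*_*____
_____*_____
_____*_____""".split("\n")

counts = Counter(rows)   # only 5 distinct messages show up
n = len(rows)
entropy_per_row = sum(c / n * log2(n / c) for c in counts.values())
print(entropy_per_row)       # ~2.2 bits per row
print(entropy_per_row * n)   # ~24 bits total, if fractional bits were sendable
print(3 * n)                 # 33 bits with the fixed 3-bit code shown above
```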

The amount of data that can be encoded in a bit is generally underestimated. Consider the 20 questions game. Go click on the link. It’s a computer program that can “read your mind.” It’s an extremely clever realization of Shannon information for a practical purpose. The thing can guess what you’re thinking about by asking you 20 questions. You only give it 20 bits of information, and it knows what you’re thinking about. The reason this is possible is that 20 bits is actually a huge amount of information. If you divide the universe of things a person can think of into 2^20 (about a million) pieces, each piece is pretty small, and is probably close enough to something that it seems like the machine can read your mind. This is also how “mentalism” and psychic powers work. Psychics and mentalists get their bits by asking questions, and either processing the answers, or noticing the audience reaction when they say something (effectively getting a bit when they’re “warm”). They also only have to deal with the limited universe of things people tend to worry about when they consult psychics and TV mentalists.

Lots of machine learning algorithms work like optimal codes, mentalists and the 20 questions game. Decision trees, for example: split the universe of learned things into a tree, which is effectively 1s and 0s (and of course, the split criterion at each node).

It’s little appreciated that compression algorithms are effectively a form of decision tree. In fact, they’re pretty good as decision trees, in particular for “online” learning, and sequence prediction. Look at a recent sequence, see if it’s like any past sequences. If it is a novel sequence, stick it in the right place in the compression tree. If it occurred before, you know what the probability of the next character in the string is by traversing the tree. Using such gizmos, you can generate artificial sequences of characters based on what is in the tree. If you encode “characters” as “words,” you can generate fake texts based on other texts. If you encode “characters” as “musical notes” you can generate fake Mozart based on Mozart. There are libraries out there which do this, but most folks don’t seem to know about them. I’ve noodled around with Ron Begleiter’s VMM code which does this (you can find some primitive R hooks I wrote here) to get a feel for it. There’s another R package called VLMC which is broadly similar, and another relative in the PST package.
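
To give a flavor of the idea, here is a toy of my own in Python (nothing to do with Begleiter’s actual VMM API or the VLMC/PST packages): a fixed-order context model that counts which character follows each recent context and predicts by lookup. Real variable-order Markov models do the same thing with smarter backoff between context lengths.

```python
from collections import defaultdict, Counter

class ContextPredictor:
    """Toy fixed-order context model: a cheap stand-in for a
    compression-style variable-order Markov predictor."""
    def __init__(self, order=3):
        self.order = order
        self.counts = defaultdict(Counter)   # context -> next-char counts

    def train(self, text):
        for i in range(len(text) - self.order):
            ctx = text[i:i + self.order]
            self.counts[ctx][text[i + self.order]] += 1

    def predict(self, recent):
        # Most likely next character after the last `order` characters,
        # or None for a novel context (which a real VMM would back off on).
        seen = self.counts.get(recent[-self.order:])
        return seen.most_common(1)[0][0] if seen else None

p = ContextPredictor(order=3)
p.train("abracadabra abracadabra abracadabra")
print(p.predict("abr"))   # 'a': this context has been seen before
```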

One of the interesting properties of compression learning is that if you have a good compression algorithm, you have a good forecasting algorithm. Compression is a universal predictor. You don’t need to do statistical tests to pick certain models; compression should always work, assuming the string isn’t pure noise. One of the main practical problems with them is picking a discretization (discretizations … can also be seen or used as an ML algorithm). If the data is already discrete, a compression algorithm which compresses well on that data will also forecast well on the data. Taking this a step further into crazytown, one can almost always think of a predictive algorithm as a form of compression. If you consider something like EWMA, you’re reducing a potentially very large historical time series to one or two numbers.
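
The EWMA point in a few lines of Python (the decay parameter below is an arbitrary choice for illustration):

```python
def ewma(xs, lam=0.1):
    # One running number "compresses" the entire history of the series.
    s = xs[0]
    for x in xs[1:]:
        s = lam * x + (1 - lam) * s
    return s

print(ewma([1.0, 2.0, 3.0, 100.0]))   # a one-number summary/forecast
```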

While the science isn’t well established yet, and it certainly isn’t widely known (though the gizmo which guesses what Google search you’re going to do is a prefix tree compression learner), it looks like one could use compression algorithms to do timeseries forecasting in the general case, hypothesis testing, decryption, classification, regression and all manner of interesting things involving data. I consider this idea to be one of the most exciting areas of machine learning research, and compression learners to be one of the most interesting ways of thinking about data. If I free up some time, I hope to write more on the subject, perhaps using Ron’s code for demonstrations.

Fun people to google on this subject:

Boris Ryabko & Son (featured in the ants thing)

Vladimir Vovk

Paul Algoet

Nicolò Cesa-Bianchi

Gabor Lugosi

The ubiquitous Thomas Cover

## Soviet interceptors: the power and the glory

Posted in big machines by Scott Locklin on July 30, 2013

The Soviets solved problems differently from the West. This wasn’t just because they had different problems to solve, though there is that. Part of the reason their artifacts are so weird to Western eyes is their engineering and scientific disciplines evolved under different pressures.

Consider the Soviet long range interceptors. Westerners mostly didn’t consider interception to be an important combat role. The Soviet Union had vast territory, a widely distributed population, and was surrounded by hostile neighbors with advanced long range bomber technology. Western nations were much smaller, with population centers less widely distributed. Western nations also only had to worry about attacks from one or two hostile nations, mostly far away. The Soviets had to worry about attacks from Turkey, Japan, West Germany, France, China, Canada and Britain; they were surrounded. Soviet long range bombers were also not very effective compared to the Western ones. Soviet reconnaissance aircraft were likewise nowhere near as good as Western ones, and almost never left Soviet borders, whereas Western reconnaissance aircraft routinely penetrated Soviet airspace. As such, point interceptors, such as the superb English Electric Lightning, sufficed for defense against bombers in the West. There were a few aborted attempts at long range, high speed interceptors in the West: the Avro Arrow, the XF-108 Rapier, and the proposed SR.187. Since no credible long-range, high speed Soviet bomber threat emerged, and missiles made this kind of attack of secondary concern, they didn’t bother going to production with these.

Tu-128UT, trainer model, informally known as, “the pelican”

The Soviets, though, had serious problems requiring effective interceptors. They also had credible future threats, like the TSR-2 and the XB-70 Valkyrie. As such, they needed fast, long range interceptors. Interception was so important that the Soviets had a whole, independent air force, the PVO-Strany, whose job it was to do nothing other than defend the airspace of Mother Russia. This wasn’t a mere bureaucratic distinction like SAC or NORAD: the PVO-Strany had their own radar installations, their own schools, and their own chain of command. They even have their own holiday, April 11, if you care to toast their brave pilots with some wodka on that day. Because the bureaucracy which developed the interceptor forces was separate from the other air force commands, and was considered of the highest importance (the memories of Stalingrad were not very old), it had very specific priorities which needed to be met by the aircraft design bureaus. In Western nations, it was generally accepted that aircraft designs should fulfill more general roles.

The early generations of these Soviet interceptors which had to fulfill these design goals are some of the most fascinating mechanical objects ever built. Their secrets were jealously guarded: they were never offered for export, and as a result, we still know little about them today. They were designed to intercept fast moving, high flying planes. They were not designed as dogfighters, unlike the Soviet general purpose fighters: they were meant to be controlled from a central command, move quickly, shoot their missiles, and return to base. Speed, climb rate, endurance and the ability to carry heavy, bomber destroying missiles were what was necessary. It was the supersonic aircraft equivalent of a top fuel dragster.

Early Su-15 Flagon-A

The Su-15 Flagon was one of the most successful Soviet heavy interceptors, and exemplifies the idea the PVO-Strany was trying to bring to fruition. It was big: 64 feet long and 38,000 lbs; about the same size as the F-105 Thunderchief fighter-bomber. It was fast: Mach 2.5. It had long legs, with a 900 mile combat radius at 60,000 feet. It used two Mig-21 motors, and, due to its little delta wing, had a preposterously hot takeoff and landing speed: as high as 280 mph in the early pure delta model shown above. It had a simple but powerful radar system, designed to overwhelm ECM defenses by burning through them. Its armament consisted of giant R-8 missiles, designed to take out large bombers. They also worked well on 747s. The Su-15 was effective against high flying targets, and almost 1300 of them patrolled the skies from its 1967 deployment until the 90s. The Russians allegedly still keep some of these mothballed in case they are needed in a time of crisis.

Later Su-15 Flagon-D. Note the cranked delta wing for slightly better low speed handling.

The Tu-128 was the heaviest fighter ever built, tipping the scales at a preposterous 88,000 pounds. By contrast, the B-58 Hustler only weighed 68,000 pounds, fully loaded with nukes. The size of the Tu-128 was no accident: it evolved from a failed supersonic bomber design. It used even more preposterously large missiles as its armament: the R-4. It wasn’t as common or successful as the Su-15, and it was slower and had a lower climb rate and ceiling, but it had its niche in Soviet air defense. Because of its enormous size it had a longer range: 1600 miles. This made it particularly well suited to long patrol missions in the unpopulated North and East; it was able to loiter for hours at potential target areas. It wasn’t as dependent on ground control as the Su-15, as it had powerful radar of its own (legend has it, it would kill rabbits by the runway), and it was often used in concert with Soviet AWACS aircraft.

Prototype Tu-128 fiddler: the bomb-looking fairing is for test equipment

The Ye-150/2 series was never actually deployed, but it is my favorite of the Soviet heavy interceptors. The Mig-25 Foxbat was its eventual, much different looking offspring, also developed by the Mikoyan design bureau (they used the “Ye” designation for prototypes, meaning yedinitsa or “single unit”). It was a sort of fat stainless steel version of a Mig-21; a metal cylinder with delta wings. Early versions made it to Mach 2.9, and climbed to an absurd service ceiling of 76,000 feet. This was higher than the service ceiling of the U-2, and it was in testing before the Gary Powers incident. The service ceiling for the F-15 is only 60,000 feet. The Ye-150 could allegedly hit 50,000 feet in two and a half minutes. The English Electric Lightning was considered one of the fastest climbers around, and it could only hit 40,000 feet in three minutes. Due to its stout tubular construction, there was also room for up to 15,000 pounds of fuel; the range was an impressive 1000 miles. The performance of this aircraft was off the hook; it was the fastest manned single-engine jet that ever flew. And it flew in 1959. The A-12/SR-71 didn’t fly at all until 1962, and it took until 1963 to beat the performance of the Ye-152.

Dual engine Ye-152a

It’s worth explaining why the Ye-152 wasn’t developed into production. It was an amazing aircraft with a lot of potential, and the long careers of the Su-15 and Tu-128 indicated there was a need for such a beast. The problems were twofold. The early Tumansky R-15 engines it was designed around were flaky. They were originally disposable cruise missile engines (later deployed in a very cool supersonic drone), and their early lifespan reflected this. The bugs were eventually worked out, and the Mig-25 was built around two of them, but they were not yet reliable in 1959, and the early prototypes delivered to the Ye-152 project were pretty bad. So bad, some early versions were, like the Su-15, powered with two Mig-21 Tumansky R-11s.

Ye-152-1: the wingtip mounted missiles were a failure, but this craft set speed and altitude records

The other reason for the project failure was more pedestrian. It was designed around the Urugan-5 weapons system. This was a Soviet system meant to do something like what the SAGE/F-106 system did. The giant pointy shock cone on the Ye-152 was a radome for the Urugan-5B or Almaz fire control radars. The system had an integrated data link from the aircraft to ground control. It is hard to say if Urugan-5 would have ended up as complex as the SAGE system, because the project was canceled, and the resources reallocated to missiles. They did develop the Vozdoohk-1 system a few years later, which accomplished similar things using the Su-15, Mig-25 and Tu-128.

Ye-152M; the last, and fastest of the lot; Mach 2.8+

The thing looked fierce: the pointy ram intake/radome evokes images of the spiked helmet of a Prussian soldier. The tiny, recessed cockpit of some versions makes it appear, from the front view, like some sinister subterranean carnivore. Overall, the thing looks like a supersonic medieval mace; all straight lines and brutally sharp angles. The designers didn’t even bother with the “area rule” -they didn’t need to; its powerful engine punched through supersonic drag issues like a hatchet through dog shit. Unlike the swoopy-doopy SR-71 or Bristol 188 or Tsybin RSR, it is an undiluted incarnation of terrifying speed and electric death.

Early Ye-150 model. A hot ship, despite engine problems: Mach 2.65 without full burners, and hitting 74,000 feet.

The PVO-Strany was probably ultimately focusing on the wrong problem. When the US moved away from high speed, high altitude bombers, and towards cruise missiles and insane rednecks flying bombing runs at tree-trimming altitudes, PVO-Strany did adapt, coming up with interceptors with “look down, shoot down” radar, and different mixtures of low altitude SAMs. The problem is, there is really not much wide-area defense one can do against low altitude attacks. This was proved dramatically when a wacky 18 year old West German with 50 hours of flying experience landed his Cessna in Red Square in 1987.

For all its failings, high altitude air defense was a noble idea, born of the historical suffering of the Soviet Union. The devices they invented to defend Rodina Mat are some of the most astonishing objects ever built by the hands of men. They are the crystallization of the Promethean human spirit of the era; a vision of a glorious future of superhuman speed and the exploration of space. “Alas, burnished fighter … How that time has passed, Dark under night’s helm, as though it never had been!”

## Ruins of forgotten empires: APL languages

Posted in Design, J, Lush by Scott Locklin on July 28, 2013

One of the problems with modern computer technology: programmers don’t learn from the great masters. There is such a thing as a Beethoven or Mozart of software design. Modern programmers seem more familiar with Lady Gaga. It’s not just a matter of taste and an appreciation for genius. It’s a matter of forgetting important things.

talk to the hand that made APL

There is a reason I use “old” languages like J or Lush. It’s not a retro affectation; I save that for my suits. These languages are designed better than modern ones. There is some survivor bias here; nobody slings PL/1 or Cobol willingly, but modern language and package designers don’t seem to learn much from the masters. Modern code monkeys don’t even recognize mastery; mastery is measured in dollars or number of users, which is a poor substitute for distinguishing between what is good and what is dumb.  Lady Gaga made more money than Beethoven, but, like, so what?

Comparing, say, Kx Systems’ Q/KDB (80s technology which still sells for upwards of $100k a CPU, and is worth every penny) to Hive or Redis is an exercise in high comedy. Q does what Hive does. It does what Redis does. It does both, along with several other impressive things modern “big data” types haven’t thought of yet, and it does them better, using only a few pages of tight C code, and a few more pages of tight K code.

This man’s software is superior to yours

APL languages were developed a long time ago, when memory was tiny compared to the modern day, and disks much slower. They use memory wisely. Arrays are the basic data type, and most APL language primitives are designed to deal with arrays. Unlike the situation in many languages, APL arrays are just a tiny header specifying their rank and shape, and a big pool of memory. Figuring out what to do with the array happens when the verb/function reads the first couple of bytes of the header. No mess, no fuss, and no mucking about with pointless loops.
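
As a toy illustration of the header-plus-pool layout (not any real APL implementation; the names below are mine), here is the idea in Python: an array is a shape tuple plus one flat buffer, and a verb decides what to do by looking at the header first.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AplArray:
    shape: tuple        # the tiny header; rank is just len(shape)
    data: List[float]   # one flat pool of memory, the same for any rank

    @property
    def rank(self):
        return len(self.shape)

def add(x: AplArray, y: AplArray) -> AplArray:
    # Element-wise +: works on the flat pool, no rank-specific loops.
    assert x.shape == y.shape, "length error"
    return AplArray(x.shape, [a + b for a, b in zip(x.data, y.data)])

m = AplArray((2, 3), [1, 2, 3, 4, 5, 6])   # a 2x3 "matrix"
print(add(m, m))   # AplArray(shape=(2, 3), data=[2, 4, 6, 8, 10, 12])
```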

Code can be confusing if you don’t drink the APL kool-aid, but the concept of rank makes it very reusable. It also relegates idiotic looping constructs to the wastebin of history. How many more for() loops do you want to write in your lifetime? I, personally, would prefer to never write another one. Apply() is the right way for grown-assed men to do things. Bonus: if you can write an apply(), you can often parallelize things. For(), you have to make too many assumptions.
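
The for() versus apply() point, sketched in Python rather than J: the apply form only says what to do to each row, so a runtime is free to reorder or parallelize it, while the for loop nails down an execution order it doesn’t need.

```python
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# for() version: the iteration order is baked in
sums_loop = []
for r in rows:
    sums_loop.append(sum(r))

# apply() version: same result, and map() can be swapped for a parallel
# map (e.g. multiprocessing.Pool.map) without touching the logic
sums_apply = list(map(sum, rows))

print(sums_loop == sums_apply)   # True
```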

Roger Hui, also constructed of awesomeness

One of the great tricks of the APL languages: using mmap instead of scanf. Imagine you have some big chunk of data. The dreary way most languages do things, you vacuum the data in with scanf, grab what is useful, and if you’re smart, throw away the useless bits. If you’re dealing with data which is bigger than core, you have to do some complex conga dance, splitting it up into manageable chunks, processing, writing it out somewhere, then vacuuming the result back in again. With mmap, you just point to the data you want. If it’s bigger than memory …. so what? You can get at it as quickly as the file system gets it to you. If it’s an array, you can run regressions on big data without changing any code. That’s how the bigmemory package in R works. Why wasn’t this built into native R from the start? Because programmers don’t learn from the masters. Thanks a lot, Bell Labs!
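
A minimal sketch of the mmap-instead-of-scanf trick in Python terms (numpy.memmap wraps mmap for you; the file name and sizes here are invented for the example):

```python
import numpy as np

# Stand-in for a big file some other process wrote: 10 million float64s.
np.arange(10_000_000, dtype="float64").tofile("ticks.f64")

# No scanf-style slurping: just map it and point at what you want.
big = np.memmap("ticks.f64", dtype="float64", mode="r", shape=(10_000_000,))
print(big[:5])                   # the OS pages in only the bytes touched
print(big[::1_000_000].mean())   # a cheap summary without a full read
```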

Fred Brooks, Larry Breed, Joey Tuttle, Arthur Whitney, Eugene McDonnell, Paul Berry: none of these men can be held responsible for inflicting the horrors of S+ on the world

This also makes timeseries databases simple. Mmap each column to a file; selects and joins are done along pointed indexes. Using a file per column saves memory when you read, since usually you only need one or a couple of columns, where most databases force you to read all of them. When you get your data and close the files, the data image is still there. Fast, simple, and with a little bit of socket work, infinitely scalable. Sure, it’s not concurrent, and it’s not an RDBMS (though both can be added relatively simply). So what? Big data problems are almost all inherently columnar and non-concurrent; RDBMS features and concurrency should be an afterthought when dealing with data which is actually big, and, frankly, in general. “Advanced” databases such as Amazon’s Redshift (which is pretty good shit for something which came out a few months ago) are only catching onto these 80s era ideas now.
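
Here is the column-per-file idea as a small Python/numpy sketch (invented file names, with numpy.memmap standing in for what Q does natively): each column lives in its own mmapped file, a select is just boolean indexing, and columns the query doesn’t mention never get opened.

```python
import numpy as np

n = 1_000_000
rng = np.random.default_rng(0)
rng.integers(0, 100, n, dtype="int64").tofile("price.i64")
np.arange(n, dtype="int64").tofile("time.i64")
# other columns would sit in their own files; this query never opens them

price = np.memmap("price.i64", dtype="int64", mode="r", shape=(n,))
time_ = np.memmap("time.i64", dtype="int64", mode="r", shape=(n,))

# select time, price where price > 95
mask = price > 95
print(time_[mask][:5], price[mask][:5])
```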

Crap like Hive spends half its time reading the damn data in, using some godforsaken text format that is not an mmapped file. Hive wastefully writes intermediate files, and doesn’t use a column approach, forcing giant unnecessary disk reads. Hive also spends its time dealing with multithreaded locking horse shit. APL uses one thread per CPU, which is how sane people do things. Why have multiple threads tripping all over each other when a query is inherently one job? If you’re querying 1, 10 or 100 terabytes, do you really want to load new data into the schema while you’re doing this? No, you don’t. If you have new data streaming in, save it somewhere else, and do that save in its own CPU and process if it is important. Upload to the main store later, when you’re not querying the data. That’s the way Q does it.

The APL family also has a near-perfect level of abstraction for data science. Function composition is trivial, and powerful paradigms and function modifications via adverbs are available to make code terse. You can afflict yourself with for loops if that makes you feel better, but the terse code will run faster. APL languages are also interactive and interpreted: mandatory for dealing with data. Because APL languages are designed to fit data problems, and because they were among the first interpreters, there is little overhead to slow them down. As a result, J or Q code is not only interactive: it’s also really damn fast.

It seems bizarre that all of this has been forgotten, except for a few old guys, deep pocketed quants, and historical spelunkers such as myself. People painfully recreate the past, and occasionally, agonizingly, come to solutions established 40 years ago. I suppose one of the reasons things might have happened this way is the old masters didn’t leave behind obvious clues, beyond, “here’s my code.” They left behind technical papers and software, but people often don’t understand the whys of the software until they run into similar problems.

Some of these guys are still around. You can actually have a conversation with mighty pioneers like Roger Hui, Allen Rose or Rohan J (maybe in the comments) if you are so inclined. They’re nice people, and they’re willing to show you the way. Data science types and programmers wanting to improve their craft and increase the power of their creations should examine the works of these masters. You’re going to learn more from studying a language such as J than you will studying the latest Hadoop design atrocity. I’m not the only one who thinks so; Wes McKinney of Pandas fame is studying J and Q for guidance on his latest invention. If you know J or Q, he might hire you. He’s not the only one. If “big data” lives up to its promise, you’re going to have a real edge knowing about the masters.

Start here for more information on the wonders of J.

http://conceptualorigami.blogspot.com/2010/12/vector-processing-languages-future-of.html