Locklin on science

String Interning Done Right

Posted in Design, Kerf by Scott Locklin on February 22, 2016

Refresher

The common method for interning strings breaks in fantastic ways. In Kerf, we’ve taken the old method and revised it for success in the current generation of languages.

If you’ve forgotten what string interning is from your time with Java, it’s a simple way of ensuring that any string appears only once. So for instance, if I have ten objects whose type is the string “lion”, I don’t have ten copies of the string “lion” in memory. What I have instead is a single copy of “lion” and then ten links which all point to the single copy somehow.

Most often these links are pointers (raw addresses in memory). We’ll discuss how this breaks soon. The reference copy of “lion” is never allowed to be moved, changed, or released. It’s permanent. The other implementation detail to figure out is how to keep the strings unique. The next time we create a “lion” string we need to trade our string in for a link, and what is usually done is that the reference copy is stored in a hashtable or some other deduplicating data structure. This lets us figure out that lion already exists, and when we perform the check we can walk away with our link at the same time. If the phrase is “bullmation”, and it doesn’t already exist, then we can add the initial copy to the data structure at the same time.
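For anyone who wants to see the moving parts, here is a minimal Python sketch of the conventional scheme described above: a hashtable holds the one reference copy, and the check and the insert happen in a single step. The `Interner` class and its method names are made up for illustration; this is the old method, not Kerf's revised one.

```python
class Interner:
    """Toy string interner: keeps one reference copy per distinct string."""

    def __init__(self):
        # Maps string contents to the single reference copy.
        self._table = {}

    def intern(self, s):
        # setdefault performs the lookup and, if the string is new,
        # the insert, then hands back the reference copy either way.
        return self._table.setdefault(s, s)


interner = Interner()

# Build the strings at runtime so each one starts out as its own object.
lions = ["".join(["li", "on"]) for _ in range(10)]

# Trade every string in for a link to the reference copy.
links = [interner.intern(s) for s in lions]

# A string that does not exist yet gets added on first sight.
new_link = interner.intern("bullmation")
```

After this runs, all ten entries of `links` refer to the same object and the table holds exactly two strings.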

…read the rest… http://getkerf.wordpress.com/2016/02/22/string-interning-done-right/

Tatra 603: wacky commie hot rod

Posted in Design by Scott Locklin on February 20, 2016

Every now and then I run into a piece of technology which I find completely mind boggling. Something that shouldn’t really exist, but does anyway. The Tatra 603 is one of these things.

For one thing, it’s a communist automobile from the former Czechoslovakia, released in 1956. You know, the communists: the people who brought us the Trabant and the Lada. The first thing you notice is that, unlike the Trabant or Lada, or even a Skoda, the Tatra is pretty.

Looking under the hood, well, you’ll find … nothing, because it’s a rear engined car, like an old Porsche. Looking in the trunk, you find … an air cooled V-8 which is insane and amazing. The only air cooled cars most people ever see are Porsches. So basically what we have here is a 6-passenger Porsche with a rumbley motor in it.

Apparently it handled like a giant Porsche also. It was also hand-made like an old Porsche. It was only a 100 horsepower V-8, but it was also a light car with a stick shift. Sort of like one of the 1930s era Jaguar sedans, except with a rear engine and the power curve of a V-8, rather than a straight six.

This mind blowing 1962 communist ad for the Tatra 603 … well, gear heads have to promise to take 13 minutes of their lives to watch this. First off, consider the fact that this was a car only allowed to high communist officials who got professional chauffeurs. I guess high communist officials just sat around all day and watched 13 minute long commercials about the glorious products of people’s Tatra factory. Second … I mean, look at the driving insanity. Road hogging, drifting … in a rear engined car, reckless (I’ve been on the very same roads; these guys are nuts) Steve McQueen style hot-dogging, off road mud-bogging, outrunning them silly Boss Hoggski policemen, hill climbing, driving on sidewalks, and doing doughnuts in Český Krumlov: they even rolled the damn car down a hill and drove away, just to show it could be done. What the hell, communist bloc leaders? Either these guys had more fun being communist officials than any other group of people in all of human history …. or I don’t know what to think. Either way, try to imagine any of this in an American car ad at any point in history. And then, remember this was communism; communism was never sold as a fun ideology; it was a grim and serious ideology covered in human blood. Just skip to the middle if you don’t have the same amount of free time as a high communist party official.

The vague resemblance to the VW bug is no coincidence. The 1930s Tatras were innovators in streamlined cars. The Tatra-77 was a direct ancestor, and the designer (Paul Jaray) was involved with Zeppelin design before he started fooling with cars. The aerodynamics of old Tatras were often better than modern cars, and the VW bug design was lifted directly from Tatra economy cars such as the V570 and the T97.

The communists had only been running the country for a few years when this thing came out in 1956, so it’s really an old capitalist/Paul Jaray design that ended up being made by commies, but it’s pretty damn cool that they kept it going until 1976. Also, the commercial makes me want to study dialectical materialism, so I can have a chauffeur and decorous, refined bimbo to drive around like a maniac with. I’m presuming that everyone in the car was completely schnockered on pivo and slivovitz, and am just a bit disappointed they weren’t all smoking like chimneys through the whole adventure.

 

http://jalopnik.com/316038/bobash-road-tests-the-1965-tatra-603

 

Timestamps done right

Posted in Design, Kerf by Scott Locklin on January 19, 2016

(Crossposted to Kerf blog)

I’ve used a lot of tools meant for dealing with time series. Heck, I’ve written a few at this point. The most fundamental piece of dealing with time series is a timestamp type. Under the covers, a timestamp is just a number which can be indexed. Normal humans have a hard time dealing with a number that represents seconds since the epoch, or nanoseconds since whenever. Humans need to see things which look like the ISO format for timestamps.

Very few programming languages have timestamps as a native type. Some SQLs do, but SQL isn’t a very satisfactory programming language by itself. At some point you want to pull your data into something like R or Matlab and deal with your timestamps in an environment that you can do linear regressions in. Kerf is the exception.

Consider the case where you have a bunch of 5 minute power meter readings (say, from a factory) with timestamps. You’re probably storing your data in a database somewhere, because it won’t fit into memory in R. Every time you query your data for a useful chunk, you have to parse the stamps in the chunk into a useful type; timeDate in the case of R. Because the guys who wrote R didn’t think to include a useful timestamp data type, the DB package doesn’t know about timeDate (it is an add on package), and so each timestamp for each query has to be parsed. This seems trivial, but a machine learning gizmo I built was entirely performance bound by this process. Instead of parsing the timestamps once in an efficient way into the database, and passing the timestamp type around as if it were an int or a float, you end up parsing them every time you run the forecast, and in a fairly inefficient way. I don’t know of any programming languages other than Kerf which get this right. I mean, just try it in Java.
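To make the "parse once" point concrete, here is a rough Python sketch. The meter readings, the `parse_stamp` helper and the `readings_between` query are all invented for illustration, and the stamps are assumed to be UTC. The text is parsed into integer epoch seconds a single time at load; every later query is plain integer comparison.

```python
from datetime import datetime, timezone


def parse_stamp(text):
    """Parse an ISO-style stamp (assumed UTC) into integer epoch seconds."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())


# Hypothetical 5 minute power meter readings as they arrive: text stamps.
raw_rows = [
    ("2012-01-01T00:00:00", 41.5),
    ("2012-01-01T00:05:00", 43.0),
    ("2012-01-01T00:10:00", 42.2),
]

# Pay the parsing cost once, when the data is loaded.
stored = [(parse_stamp(stamp), kw) for stamp, kw in raw_rows]


def readings_between(rows, start, end):
    """Return rows with start <= time < end using integer comparisons only."""
    return [(t, kw) for t, kw in rows if start <= t < end]


start = parse_stamp("2012-01-01T00:05:00")
chunk = readings_between(stored, start, start + 600)
```

Running the forecast a thousand times against `stored` never touches a string again, which is the behavior you would want from the database and language working together.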

Kerf gets around this by integrating the database with the language.

Kerf also has elegant ways of dealing with timestamps within the language itself.

Consider a timestamp in R’s timeDate. R’s add-on packages timeDate + zoo or xts are my favorite way of doing such things in R, and it’s the one I know best, so this will be my comparison class.


 

require(timeDate) 
a=as.timeDate("2012-01-01")
a
GMT
[1] [2012-01-01]

 

In Kerf, we can just write the timestamp down


 

a:2012.01.01
  2012.01.01

 

A standard problem is figuring out what a date is relative to a given day. In R, you have to know that it’s basically storing seconds, so:


 

as.timeDate("2012-01-01") + 3600*24
GMT
[1] [2012-01-02]

 

In Kerf, just tell it to add a day:


 

2012.01.01 + 1d
  2012.01.02

 

This gets uglier when you have to do something more complex. Imagine you have to add a month and a day. To do this in general in R is complex and involves writing functions.

In Kerf, this is easy:


 

2012.01.01 + 1m1d
  2012.02.02

 

Same story with hours, minutes and seconds


 

2012.01.01 + 1m1d + 1h15i17s
  2012.02.02T01:15:17.000

 

And if you have to find a bunch of times which are a month, day, hour and 15 minutes and 17 seconds away from the original date, you can do a little Kerf combinator magic:


 

b: 2012.01.01 + (1m1d + 1h15i17s) times mapright  range(10)
  [2012.01.01, 2012.02.02T01:15:17.000, 2012.03.03T02:30:34.000, 2012.04.04T03:45:51.000, 2012.05.05T05:01:08.000, 2012.06.06T06:16:25.000, 2012.07.07T07:31:42.000, 2012.08.08T08:46:59.000, 2012.09.09T10:02:16.000, 2012.10.10T11:17:33.000]

 

The mapright combinator runs the verb and noun to its right on the vector which is to the left. So you’re multiplying (1m1d + 1h15i17s) by range(10) (which is the usual [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] ), then adding it to 2012.01.01.
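For readers who don't speak Kerf, the same computation can be sketched in Python. The `add_months` helper is hypothetical (the standard library has no month arithmetic), and the sketch assumes the month part of each scaled offset is applied before the day and time part, which is enough to reproduce the output shown above.

```python
from datetime import datetime, timedelta


def add_months(dt, months):
    """Shift dt by whole calendar months.

    Assumes the day of the month exists in the target month, which holds
    here because the start date falls on the 1st.
    """
    index = dt.year * 12 + (dt.month - 1) + months
    return dt.replace(year=index // 12, month=index % 12 + 1)


start = datetime(2012, 1, 1)

# The non-month part of 1m1d + 1h15i17s as a fixed duration.
step = timedelta(days=1, hours=1, minutes=15, seconds=17)

# Multiply the offset by 0..9 (the mapright over range(10)), then add
# each scaled offset to the start date.
b = [add_months(start, k) + k * step for k in range(10)]

for stamp in b:
    print(stamp.isoformat())
```

It works, but notice how much machinery it takes to say what Kerf says in one line.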

You can’t actually do this in a simple way in R. Since there is no convenient token to add a month, you have to generate a time sequence with monthly periods. The rest is considerably less satisfying as well, since you have to remember to add raw numbers of seconds. In my opinion, this is vastly harder to read and maintain than the Kerf line.


 

b=timeSequence(from=as.timeDate("2012-01-01"),length.out=10,by="month") + (3600*24 + 3600 + 15*60 + 17) *0:9
 [2012-01-01 00:00:00] [2012-02-02 01:15:17] [2012-03-03 02:30:34] [2012-04-04 03:45:51] [2012-05-05 05:01:08] [2012-06-06 06:16:25] [2012-07-07 07:31:42] [2012-08-08 08:46:59] [2012-09-09 10:02:16] [2012-10-10 11:17:33]

 

This represents a considerable achievement in language design; an APL which is easier to read than a commonly used programming language for data scientists. I am not tooting my own horn here, Kevin did it.

If I wanted to know what week or second these times occur at, I can subset the implied fields in a simple way in Kerf:


 

b['week']
  [1, 6, 10, 15, 19, 24, 28, 33, 37, 42]
b['second']
  [0, 17, 34, 51, 8, 25, 42, 59, 16, 33]

 

I think the way to do this in R is with the “.endpoints” function, but it doesn’t seem to do the right thing:


 

sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04 LTS
other attached packages:
[1] xts_0.9-7         zoo_1.7-12        timeDate_3012.100

.endpoints(b, on="week")
 [1]  0  1  2  3  4  5  6  7  8  9 10
.endpoints(b, on="second")
 [1]  0  1  2  3  4  5  6  7  8  9 10

 

You can cast to a POSIXlt and get the second at least, but no week of year.


 

as.POSIXlt(b)$week
NULL
as.POSIXlt(b)$sec
 [1]  0 17 34 51  8 25 42 59 16 33

 

Maybe doing this using one of the other date classes, like as.Date…


 

weekGizmo<-function(x){ as.numeric(format(as.Date(time(x))+3,"%U")) }

 

Not exactly clear, but it does work. If you have ever done things with time in R, you will have had an experience like this. I’m already reaching for a few different kinds of date and time objects in R. There are probably a dozen kinds of timestamps in R which do different subsets of things, because whoever wrote them wasn’t happy with what was available at the time. One good one is better. That way when you have some complex problem, you don’t have to look at 10 different R manuals and add on packages to get your problem solved.
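The same mess shows up outside R. Here is a small Python sketch, using only the standard library, that pulls the second and two different "week of year" numbers out of the first three stamps from the earlier example. Which convention Kerf's `week` field follows is not something this sketch claims to know; the point is only that "week" has more than one definition, so every language ends up with more than one answer.

```python
from datetime import datetime

# First three stamps from the sequence built earlier in the post.
stamps = [
    datetime(2012, 1, 1),
    datetime(2012, 2, 2, 1, 15, 17),
    datetime(2012, 3, 3, 2, 30, 34),
]

# The seconds field is unambiguous.
seconds = [t.second for t in stamps]

# ISO 8601 week number: weeks start on Monday, and a Sunday that falls on
# January 1st belongs to the last week of the previous year.
iso_weeks = [t.isocalendar()[1] for t in stamps]

# strftime("%U") week number: weeks start on Sunday, and days before the
# first Sunday of the year are week 0.
sunday_weeks = [int(t.strftime("%U")) for t in stamps]

print(seconds, iso_weeks, sunday_weeks)
```

January 1, 2012 was a Sunday, so the two week conventions already disagree on the very first stamp.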

Here’s a more complex problem. Let’s say you had a time series a million rows long with some odd periodicities, and you want to find the values which occur at week 10, second 33 of any minute.


 

ts:{{pwr:rand(1000000,1.0),time:(2012.01.01 + (1h15i17s times mapright  range(1000000)))}}
timing(1)
select *,time['second'] as seconds,time['week'] as weeks from ts where time['second']=33 ,time['week'] =10

┌────────┬───────────────────────┬───────┬─────┐
│pwr     │time                   │seconds│weeks│
├────────┼───────────────────────┼───────┼─────┤
│0.963167│2012.03.01T01:40:33.000│     33│   10│
│0.667559│2012.03.04T04:57:33.000│     33│   10│
│0.584127│2013.03.06T05:06:33.000│     33│   10│
│0.349303│2013.03.09T08:23:33.000│     33│   10│
│0.397669│2014.03.05T01:58:33.000│     33│   10│
│0.850102│2014.03.08T05:15:33.000│     33│   10│
│0.733821│2015.03.03T22:50:33.000│     33│   10│
│0.179552│2015.03.07T02:07:33.000│     33│   10│
│       ⋮│                      ⋮│      ⋮│    ⋮│
└────────┴───────────────────────┴───────┴─────┘
    314 ms

 

In R, I’m not sure how to do this in an elegant way … you’d have to use a function that outputs the week of year, then something like this function (which, FWIW, is fairly slow) to do the query.


 

require(xts)
ts=xts(runif(1000000), as.timeDate("2012-01-01") + (3600 + 15*60 + 17) *0:999999)
weekGizmo<-function(x){ as.numeric(format(as.Date(time(x))+3,"%U")) }
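queryGizmo <- function(x) { 
 # work on the argument x rather than the global ts
 wks= weekGizmo(time(x))
 secs=as.POSIXlt(time(x))$sec
 # bind the week and second columns on, then keep the matching rows
 cbind(x,wks,secs)->newx
 newx[(wks==10) & (secs==33)]
}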
system.time(queryGizmo(ts))
   user  system elapsed 
  4.215   0.035   4.254

 

The way R does timestamps isn’t terrible for a language designed in the 1980s, and the profusion of time classes is to be expected from a language that has been around that long. Still, it is 2016, and there is nothing appreciably better out there other than Kerf.

Lessons for future language authors:

(continues at official Kerf blog)

 

Putin’s nuclear torpedo and Project Pluto

Posted in big machines by Scott Locklin on December 31, 2015

There was some wanking among the US foreign policy wonkosphere about the nuclear torpedo “accidentally” mentioned in a Russian news video.

The device described in the leak is a megaton class long range nuclear torpedo. The idea is, if you build a big enough bomb and set it off in coastal waters, it will create a 1000 foot high nuclear tidal wave that will physically wipe out coastal cities and Naval installations, as well as pollute them with radioactive fallout. If the Rooskies are working on such a thing, rather than trolling the twittering pustules in our foreign policy “elite,” it is certainly nothing new. Such a device was considered in the Soviet Union in the 1950s, and the original November class submarine design (the first non-US built nuclear sub) was designed around it. It was called the T-15 “land attack” torpedo. Oddly, this idea originated from America’s favorite Soviet dissident, Andrei Sakharov, when he was thinking about delivery systems for his 100 megaton class devices. People forget that young Sakharov was kind of a dick. Mind you, the Soviet Navy sank this idea, in part because it only had a range of 25 miles (meaning it was basically a suicide mission), but also because, according to Sakharov’s autobiography, some grizzled old Admiral put it: “we are Navy; we don’t make war on civilian populations…”

Notice the big hole in the front: that’s where the original doomsday torpedo went

The gizmo shown in this recent Russian leak is a modern incarnation of the T-15 land attack torpedo without the Project 627/November class submarine delivery system. Same 1.6 meter caliber, megaton class warhead and everything. The longer range of 5000 miles versus the 25 of the T-15 could be considered an innovation, and is certainly possible, but it only has tactical implications. From a strategic point of view: they had the same idea years ago, for roughly the same reasons. Fifties era Soviet nuclear weapons delivery systems were not as reliable as American ones. In the 50s it was because Soviet bombers of the era were junk (mostly copies of the B-29). If they’re building this now, it’s because they’re worried about US missile defense.

 

Various analysts have been speculating that the thing is wrapped in cobalt or something to make it more dirty, because the rooskie PowerPoint talks about area denial. While it’s entirely possible, these dopes posing as analysts have some weird ideas about what a nuclear weapon is, and what it does. Nobody seems to have noticed that there’s a nuclear reactor pushing the thing around; presumably one using liquid metal coolants like the Alfa class submarines. I’m pretty sure lighting off a nuke next to a nuclear reactor will make some nasty and long lived fallout. At 1 megaton, just the bomb casing and tamper makes a few hundred pounds of nasty long lived radioactive stuff. The physics package the Russians would likely use (SS-18 Mod-6 rated at 20Mt, recently retired from deployment atop SS-18 Satan missiles) is a fission-fusion-fission bomb, and inherently quite “dirty” since most of the energy is released from U-238. Worse still: blowing up a 1-100 megaton device in coastal mud will make lots of nasty fallout. Sodium-24 (from the salt in the water) is deadly. Half life is around 15 hours, meaning it would be clear in a few days, but being around it for the time it is active …. Then there is sodium-22, which has a half life of two and a half years; nukes in the water make less of this than sodium-24, but, well, go look it up. There is all kinds of other stuff in soil and muck which makes for unpleasant fallout. There’s an interesting book (particularly the 1964 edition) called “The Effects of Nuclear Weapons” available on archive. Chapter 9 shows some of the fallout patterns you can expect from blowing something like this up. Or, you could use this calculator thing; a 1Mt device makes a lethal fallout cloud over thousands of square kilometers.
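If you want to check the "clear in a few days" arithmetic for sodium-24, the decay law is all you need: the fraction left after t hours is 0.5 raised to t over the half life. A quick Python sketch, using the roughly 15 hour figure quoted above (the function name is mine):

```python
def fraction_remaining(hours, half_life_hours=15.0):
    """Fraction of a radioactive isotope left after the given time."""
    return 0.5 ** (hours / half_life_hours)


# How much sodium-24 is left after one, three and five days.
for days in (1, 3, 5):
    left = fraction_remaining(24 * days)
    print(f"after {days} day(s): {left:.4%} remaining")
```

Five days is exactly eight half lives, so about one part in 256 is left by then. The same function with a half life of two and a half years shows why sodium-22 is the one that sticks around.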

 

The twittering pustules who pass for our foreign policy elite are horrified, just horrified that the rooskies would spook us with such a device.  As if this were somehow a morally inferior form of megadeath to lobbing a couple thousand half megaton nuclear missile warheads at your least favorite country. Apparently this is how civilized countries who do not possess enemies with a plurality of coastal cities exterminate their foes. I don’t understand such people. Nuclear war is bad in general, m’kay? Mass slaughter with a nuclear torpedo is not morally inferior to mass slaughter with an ICBM. More to the point, getting along with Russians is easy and vodka is cheaper and more effective than ABM (and doomsday torpedo) defenses. If we hired actual diplomats and people who study history, instead of narcissistic toadies and sinister neocon apparatchiks to labor in our foreign services … maybe the Russians wouldn’t troll us with giant nuke torpedoes.

Doomsday engineering is often stranger than any science fiction. The things they built back in the cold war were weird.  While the US never admitted to building any 100 megaton land torpedoes (probably because Russia doesn’t have as many important coastal cities as the US does), we certainly worked on some completely bonkers nuclear objects.

Imagine a locomotive sized cruise missile, powered by a nuclear ramjet, cruising at mach-3 at tree level. The cruise missile showers the landscape with two dozen hydrogen bombs of the megaton class, or one big one in the 20 megaton class. When it has finished its job of raining electric death mushrooms all over the enemy, it cruises around farting deadly radioactive dust and flattening cities with the sheer power of the sonic boom… for months. In principle, such a device can go on practically forever. If I were to use such a contraption as a plot device, you’d probably think it was far fetched. Such a thing was almost built by the Vought corporation 50 years ago. Click on the link. The Vought corporation thought it was cool enough to brag about it on their website (please don’t take it down guys; anyway if you do, I’ll put it back up).

65,000 lbs, 80 feet long, with the terrifying code name, SLAM (Supersonic, Low Altitude Missile), or … “project Pluto.” This thing was perilously close to being built. They tested the engines at full scale and full power at Jackass Flats, and the guidance system was good enough they used essentially the same thing in the Tomahawk cruise missile. The problem wasn’t technical  … but how to test it? The fact that it was an enormous nuclear ramjet made it inherently rather dangerous. Someone suggested flight testing it on a tether in the desert. That would have been quite a tether to hold a mach 3 locomotive in place. Fortunately, we had rocket scientists who built ICBMs that worked. Of course, having an ICBM class booster would have been necessary to make the thing work in the first place (nuclear ramjets don’t start working until they’re moving at a decent velocity), which makes you wonder why they ever thought this was a good idea. Probably because people who dream these things up are barking looneys. Not that I wouldn’t have worked on this project, given the chance.

The ceramic matrix for the reactor was actually made by the Coors Porcelain company. Yes, the same company that makes shitty beer has been (and continues to be) an innovator in ceramics; and this originated from the founder’s needing good materials for beer bottles and inventing beer cans. According to Jalopnik, they used exhaust header paint ordered from Hot Rod magazine to protect some of the electronic components. Apparently when they lit the reactor off at full power for the first time, they got so shitfaced, the project director (Merkle; yes, nano-dude’s father) had vitamin B shots issued to the celebrants the following day. Yes, I would have worked on project SLAM: as far as I can tell, it was the most epic redneck project ever funded by the US government. Not that we should have built such a thing, but holy radioactive doomsday smoke, Batman, it would have been a fun job for a few years.

I wouldn’t blame the Russians if they wanted to build a giant nuclear torpedo-codpiece when the US sends Russophobic dipshits like Michael McFaul to represent us in Russia (look at his twitter feed; it is completely bonkers). I certainly hope they don’t build such a thing. It would also be nice if the US would stop screwing around with crap like that as well. I’m pretty sure the new torpedo is a giant troll, but the T-15 and Project Pluto were not.

Interesting pdf on Project Pluto:

http://www.amug.us/downloads/Pluto-Phoenix%20Facility%20at%20the%20NTS.pdf

Edit to add: a fascinating Russian Wikipedia page that MichaelMoser123 posted to Hacker News:

https://ru.wikipedia.org/wiki/%D0%A1%D1%82%D0%B0%D1%82%D1%83%D1%81-6