This dope got lucky in 2012, essentially using “take the mean” and was hailed as a prophet. He was wrong about virtually everything, and if someone were to make a table of his predictions over time and calculate Brier scores, I’m pretty sure he’ll get a higher score than Magic-8 ball (Brier scores, lower is better). Prediction is difficult, as the sage said, especially regarding the future. Claiming you can do prediction when you can’t is irresponsible and can actually be dangerous.
While he richly deserves to find his proper station in life as an opinionated taxi driver, this clown is unfortunately likely to be with us for years to come, bringing shame on the profession of quantitative analysis of data. We’ll be watching, Nate.
Kerf is primarily designed as a tool for dealing with data streams. The most obvious streams of data which bring in customers are stock market data, but there are lots of interesting data streams in the world. Enough that people are starting to have a name for this “vertical” -aka “internet of things.” One of the “things of the internet” I’ve worked with in the past is smartmeter data. This is a stream of electrical power consumption data from individual power meters in homes, commercial sites, factories, electric car chargers and what have you. While everyone knows that markets create big streams of potentially interesting data, the data from power grids is also big and interesting. It’s big because a typical city might have millions of power meters in it, recording data at minute or quarter hour increments. It’s interesting because power companies need to know what is going on with their networks to provide uninterrupted electrical service that we all take for granted. It’s also very interesting to power companies who rely on “renewables” such as wind or solar which are not steady or controllable producers of power. If you have a power grid which relies on mostly renewables, you have all kinds of complicated operations research problems to deal with during times when it is cloudy, dark or the breezes are calm.
The data volume of “internet of things” applications can be unwieldy to manage in standard tools, but there are analytics challenges as well. A standard challenge for smartgrid stuff is doing forecasts. Such models can get really hairy. Every smart meter installation is quite different from every other smart meter. A power meter for a car charging station is vastly different from that of a school, an apartment or a house, and all of those things are different from one another. The “users” of each apartment and house have different schedules, appliances and devices. Getting this right in an automated way can pay big dividends; you can provide services to the people with the meter, and you can do more accurate forecasts on a city wide level if you understand the generating process at the individual meter level.
If you want to get something like this right in an automated fashion, it’s a fairly large project involving paying people like me to think about it for a few months. Doing all this in an automated, general and reliable way involves feature creation, feature selection algorithms, machine learning, data gathering, data cleansing, weather forecasting, wind forecasting, optimization and the kitchen sink. I’d love to eventually have the full suite of such tools to do such a project in Kerf as primitives, but we don’t know if customers would appreciate or use such things, so I’ll make do with some small steps for this blog.
My latest relevant nerdy article for Taki; a belly laugh at Google thinking they’re still innovative and cool the way they were 6 years ago: