Visualizing the Ongoing California Drought

Wanting to know how the ongoing North American Drought affects California, I went looking for a very simple chart: one showing the evolution of the storage levels of the major water reservoirs as a percentage of their capacity.

The website of the California Department of Water Resources has a map showing the status of the 12 major reservoirs, but you will not find an aggregate graph.

So I made one.

I scraped the monthly storage levels of 11 of the 12 major reservoirs [1] for the last 40 years, in order to include data from the California drought of 1976 and 1977. For each month, I calculated the aggregate storage level (the sum of the water stored in each reservoir, divided by the sum of their capacities). The result is below:

We can clearly see the yearly wet/dry seasons. As of September 2014, the reservoirs are at 27.6% of capacity in aggregate. By this metric, this is the worst in the 37 years since the 1976-1977 drought, when the aggregate storage level dropped to 14.6% in October 1977 (California had much worse water management policies at the time; you can read a fascinating 108-page report about it).

Looking at the graph, one can guess that if the upcoming winter does not bring abundant rainfall or snowfall, the drought will be as bad as, or worse than, the 1976-77 drought by the end of the summer of 2015. This is rather scary.
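
The aggregation step itself is just a few lines of code. Below is a minimal sketch in Python, assuming the monthly per-reservoir data has already been scraped into a dict; the capacity figures are approximate and only illustrative, not the exact values used for the chart.

    # Minimal sketch of the aggregation step (not my actual scraping scripts).
    # Assumes monthly storage has already been downloaded into a dict of the
    # form {reservoir: {month: storage_in_acre_feet}}.

    CAPACITY_AF = {
        "SHA": 4_552_000,  # Shasta (approximate capacity in acre-feet)
        "ORO": 3_538_000,  # Oroville
        "CLE": 2_448_000,  # Trinity
        # ... the other reservoirs ...
    }

    def aggregate_storage_pct(monthly_storage, month):
        """Sum of water stored across reservoirs, divided by the sum of capacities."""
        stored = sum(monthly_storage[r][month] for r in CAPACITY_AF)
        capacity = sum(CAPACITY_AF.values())
        return 100.0 * stored / capacity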

[1] For some reason the Exchequer reservoir (EXC) does not have a monthly data feed, only daily/hourly ones, and I was too lazy to adjust my scripts for this peculiarity.

mrb | Tuesday 14 October 2014 at 10:00 pm | | Default | No comments

CDN53

Proof-of-concept for a super-distributed CDN storing data in DNS records

I wrote a Chrome extension that uses DNS instead of HTTP to fetch web content. It implements the fake TLD .cdn53: when you visit http://zorinaq.com.cdn53, the extension intercepts the request and sends a DNS query for the TXT record of "_cdn53.zorinaq.com"; the response contains the HTML content. As simple as that.

It also works for URLs containing a path (http://zorinaq.com.cdn53/foo/bar). It works with relatively large resources (each TXT record can contain up to ~65 kB). It works for any content including binary data: JPEG, GIF, JavaScript, CSS, etc.
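
To illustrate the mechanism, here is a minimal sketch of the lookup side in Python, using the dnspython library (this is not how the Chrome extension itself is implemented). It handles only the root URL, since the mapping of URL paths to record names is not described above.

    # Sketch of the CDN53 lookup: map "zorinaq.com.cdn53" to a TXT query for
    # "_cdn53.zorinaq.com" and treat the record data as the page content.
    # Requires dnspython: pip install dnspython

    import dns.resolver

    def fetch_cdn53(hostname: str) -> bytes:
        domain = hostname[:-len(".cdn53")] if hostname.endswith(".cdn53") else hostname
        answer = dns.resolver.resolve(f"_cdn53.{domain}", "TXT")
        # TXT record data is split into <=255-byte chunks; reassemble them.
        return b"".join(chunk for rdata in answer for chunk in rdata.strings)

    if __name__ == "__main__":
        print(fetch_cdn53("zorinaq.com.cdn53").decode("utf-8", errors="replace"))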

A system like CDN53 has quite a few advantages over HTTP:

Read More

mrb | Thursday 02 October 2014 at 11:40 pm | | Default | Two comments

A Decentralized API to Personal Information

"This blew my mind. Why the f*ck isn't this being done yet?"Reddit comment

An idea hit me: email is such a pervasive tool, used by so many people and supported by so many software stacks, that it is the ultimate vendor-neutral platform for building programmatic services between applications and people: automatic exchange of contact information, PGP keys, Bitcoin addresses, automated two-factor authentication, and so much more!

Here is an example: imagine you sign up on a web forum as bob@example.com. The site could send you a "special hidden email" to request your preferred avatar image, nickname, language, and timezone. And your mailbox automatically replies (like a vacation autoresponder) with the information formatted in a specific way, so that there is no need to re-enter the same information over and over on every web forum! (Like Gravatar but decentralized.) This would happen completely transparently, behind the scenes, without you even seeing the automatic email exchange.

Another example: encrypted email has failed to see wide adoption. Why? Because none of the setup steps are automated, so its use is cumbersome. First you have to ask the recipient whether he even knows what PGP is, then whether he wants to use encrypted email at all, and where his key can be obtained (key servers help, but not always). Instead, imagine if the first time you composed an email to a new friend and hit "send", the email application first sent a "special hidden email" to your friend asking for his PGP key. Your friend's mailbox would automatically reply "here is my key" (or "no key is configured"). Your mailbox would receive the attached key, use it to encrypt the email you composed, and save it for future use! With email encryption negotiated automatically like that, its use would be much more widespread.

This concept is what I call programmable email: email requests sent and replied to, automatically, in order to exchange personal information in a well-defined format without relying on a central database. In other words, this is a decentralized application programming interface (API) to personal information.
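
To make the idea concrete, here is a purely hypothetical sketch of such an exchange in Python. The X-Programmable-Email header and the JSON payload are placeholders of my own invention, not part of any existing standard; they only illustrate a machine-readable request answered automatically by the recipient's mailbox.

    import json
    from email.message import EmailMessage

    def build_profile_request(sender, recipient, fields):
        """The forum's side: ask bob@example.com for a few profile fields."""
        msg = EmailMessage()
        msg["From"], msg["To"] = sender, recipient
        msg["Subject"] = "profile request"          # never shown to the user
        msg["X-Programmable-Email"] = "request"     # hypothetical marker header
        msg.set_content(json.dumps({"request": fields}))
        return msg

    def build_auto_reply(request_msg, profile):
        """The mailbox's side: answer automatically, like a vacation autoresponder."""
        wanted = json.loads(request_msg.get_content())["request"]
        reply = EmailMessage()
        reply["From"], reply["To"] = request_msg["To"], request_msg["From"]
        reply["Subject"] = "profile response"
        reply["X-Programmable-Email"] = "response"
        reply.set_content(json.dumps({k: profile.get(k) for k in wanted}))
        return reply

    req = build_profile_request("signup@forum.example", "bob@example.com",
                                ["avatar_url", "nickname", "language", "timezone"])
    print(build_auto_reply(req, {"nickname": "bob", "language": "en",
                                 "timezone": "America/Los_Angeles"}))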

Read More

mrb | Sunday 13 July 2014 at 4:27 pm | | Default | Eight comments

1 Bitcoin = 1000 US dollars

In the last hour, around November 19 00:30 UTC, the value of a single bitcoin on the world's largest Bitcoin exchange, BTCChina, rose to 1,000 US dollars (USD), or 6,092.00 Chinese yuan (CNY).

Let me repeat this: 1 bitcoin is worth $1,000.

As I write these words, the exchange rate is about ¥6,900 or $1,100 and continues to increase. Other exchanges are a bit behind ($900 on MtGox and $750 on Coinbase), but arbitrage is taking place, so they should reach that level within the next few hours, assuming no crash.

I have been telling people since 2010 that Bitcoin is a revolutionary technology: the world's first decentralized, censorship-resistant, inflation-resistant digital currency. With the overwhelmingly positive tone of today's first US Senate hearing on virtual currencies, and Bitcoin in particular, Bitcoin's future has never been so promising.

I personally value a system like Bitcoin at, at the very least, the size of the remittance market (Western Union, MoneyGram, etc.) or the size of the gold market, and possibly a lot more than that. This lower bound puts the worth of Bitcoin at very roughly $100 billion to $1,000 billion. With a maximum theoretical supply of 21 million bitcoins, this puts the worth of 1 bitcoin at $5,000 to $50,000. Bitcoin will remain very volatile in the near future. Sure, a crash back below $1,000 is possible or even likely, but it would not change its long-term prospect of being valued, a few years from now, between $5,000 and $50,000, and possibly more.
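
The arithmetic behind that range is a one-liner; here it is spelled out, using nothing more than the rough market-size assumptions above:

    # Back-of-the-envelope: assumed total market worth divided by the 21 million coin cap.
    MAX_COINS = 21_000_000

    for total_worth_usd in (100e9, 1000e9):        # $100 billion to $1,000 billion
        per_coin = total_worth_usd / MAX_COINS
        print(f"${total_worth_usd / 1e9:,.0f} billion market -> ${per_coin:,.0f} per bitcoin")
    # $100 billion market -> $4,762 per bitcoin
    # $1,000 billion market -> $47,619 per bitcoin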

mrb | Monday 18 November 2013 at 4:57 pm | | Default | Two comments

"I Contribute to the Windows Kernel. We Are Slower Than Other Operating Systems. Here Is Why."

I was explaining on Hacker News why Windows fell behind Linux in terms of operating system kernel performance and innovation, when out of nowhere an anonymous Microsoft developer who contributes to the Windows NT kernel wrote a fantastic and honest response acknowledging this problem and explaining its cause. His post has been deleted! Why the censorship? I am reposting it here. It is too insightful to be lost. [Edit: The anonymous poster himself deleted his post because he thought it was too cruel and did not help make his point, which is about the social dynamics of spontaneous contribution. However, he let me know he does not mind the repost on the condition that I redact the SHA1 hash info, which I did.] [Edit: The anonymous person has made a second, apologetic statement. See the update at the bottom.]

"""

I'm a developer in Windows and contribute to the NT kernel. (Proof: the SHA1 hash of revision #102 of [Edit: filename redacted] is [Edit: hash redacted].) I'm posting through Tor for obvious reasons.

Windows is indeed slower than other operating systems in many scenarios, and the gap is worsening. The cause of the problem is social. There's almost none of the improvement for its own sake, for the sake of glory, that you see in the Linux world.

Granted, occasionally one sees naive people try to make things better. These people almost always fail. We can and do improve performance for specific scenarios that people with the ability to allocate resources believe impact business goals, but this work is Sisyphean. There's no formal or informal program of systemic performance improvement. We started caring about security because pre-SP3 Windows XP was an existential threat to the business. Our low performance is not an existential threat to the business.

See, component owners are generally openly hostile to outside patches: if you're a dev, accepting an outside patch makes your lead angry (due to the need to maintain this patch and to justify the unplanned design change in shiproom), makes test angry (because test is on the hook for making sure the change doesn't break anything, and you just made work for them), and makes the PM angry (due to the schedule implications of code churn). There's just no incentive to accept changes from outside your own team. You can always find a reason to say "no", and you have very little incentive to say "yes".

There's also little incentive to create changes in the first place. On linux-kernel, if you improve the performance of directory traversal by a consistent 5%, you're praised and thanked. Here, if you do that and you're not on the object manager team, then even if you do get your code past the Ob owners and into the tree, your own management doesn't care. Yes, making a massive improvement will get you noticed by senior people and could be a boon for your career, but the improvement has to be very large to attract that kind of attention. Incremental improvements just annoy people and are, at best, neutral for your career. If you're unlucky and you tell your lead about how you improved performance of some other component on the system, he'll just ask you whether you can accelerate your bug glide.

Is it any wonder that people stop trying to do unplanned work after a little while?

Another reason for the quality gap is that we've been having trouble keeping talented people. Google and other large Seattle-area companies keep poaching our best, most experienced developers, and we hire youths straight from college to replace them. You find SDEs and SDE IIs maintaining hugely important systems. These developers mean well and are usually adequately intelligent, but they don't understand why certain decisions were made, don't have a thorough understanding of the intricate details of how their systems work, and, most importantly, don't want to change anything that already works.

These junior developers also have a tendency to make improvements to the system by implementing brand-new features instead of improving old ones. Look at recent Microsoft releases: we don't fix old features, but accrete new ones. New features help much more at review time than improvements to old ones.

(That's literally the explanation for PowerShell. Many of us wanted to improve cmd.exe, but couldn't.)

More examples:

  • We can't touch named pipes. Let's add %INTERNAL_NOTIFICATION_SYSTEM%! And let's make it inconsistent with virtually every other named NT primitive.
  • We can't expose %INTERNAL_NOTIFICATION_SYSTEM% to the rest of the world because we don't want to fill out paperwork and we're not losing sales because we only have 1990s-era Win32 APIs available publicly.
  • We can't touch DCOM. So we create another %C#_REMOTING_FLAVOR_OF_THE_WEEK%!
  • XNA. Need I say more?
  • Why would anyone need an archive format that supports files larger than 2GB?
  • Let's support symbolic links, but make sure that nobody can use them so we don't get blamed for security vulnerabilities (Great! Now we get to look sage and responsible!)
  • We can't touch Source Depot, so let's hack together SDX!
  • We can't touch SDX, so let's pretend for four releases that we're moving to TFS while not actually changing anything!
  • Oh god, the NTFS code is a purple opium-fueled Victorian horror novel that uses global recursive locks and SEH for flow control. Let's write ReFS instead. (And hey, let's start by copying and pasting the NTFS source code and removing half the features! Then let's add checksums, because checksums are cool, right, and now with checksums we're just as good as ZFS? Right? And who needs quotas anyway?)
  • We just can't be fucked to implement C11 support, and variadic templates were just too hard to implement in a year. (But ohmygosh we turned "^" into a reference-counted pointer operator. Oh, and what's a reference cycle?)

Look: Microsoft still has some old-fashioned hardcore talented developers who can code circles around brogrammers down in the valley. These people have a keen appreciation of the complexities of operating system development and an eye for good, clean design. The NT kernel is still much better than Linux in some ways --- you guys be trippin' with your overcommit-by-default MM nonsense --- but our good people keep retiring or moving to other large technology companies, and there are few new people achieving the level of technical virtuosity needed to replace the people who leave. We fill headcount with nine-to-five-with-kids types, desperate-to-please H1Bs, and Google rejects. We occasionally get good people anyway, as if by mistake, but not enough. Is it any wonder we're falling behind? The rot has already set in.

"""

Edit: This anonymous poster contacted me, still anonymously, to make a second statement, worried by the attention his words are getting:

"""

All this has gotten out of control. I was much too harsh, and I didn't intend this as some kind of massive exposé. This is just grumbling. I didn't appreciate the appetite people outside Microsoft have for Kremlinology. I should have thought through my post much more thoroughly. I want to apologize for presenting a misleading impression of what it's like on the inside.

First, I want to clarify that much of what I wrote is tongue-in-cheek and over the top --- NTFS does use SEH internally, but the filesystem is very solid and well tested. The people who maintain it are some of the most talented and experienced I know. (Granted, I think they maintain ugly code, but ugly code can back good, reliable components, and ugliness is inherently subjective.) The same goes for our other core components. Yes, there are some components that I feel could benefit from more experienced maintenance, but we're not talking about letting monkeys run the place. (Besides: you guys have systemd, which, if I'm going to treat it the same way I treated NTFS, is an all-devouring octopus monster about to crawl out of the sea, eat Tokyo, and spit it out as a giant binary logfile.)

In particular, I don't have special insider numbers on poaching, and what I wrote is a subjective assessment written from a very limited point of view --- I watched some very dear friends leave and I haven't been impressed with new hires, but I am *not* HR. I don't have global facts and figures. I may very well be wrong on overall personnel flow rates, and I shouldn't have made the comment I did: I stated it with far more authority than my information merits.

Windows and Microsoft still have plenty of technical talent. We do not ship code that someone doesn't maintain and understand, even if it takes a little while for new people to ramp up sometimes. While I have read and write access to the Windows source and commit to it once in a while, so do tens and tens of thousands of other people all over the world. I am nobody special. I am not Deep Throat. I'm not even Steve Yegge. I'm not the Windows equivalent of Ingo Molnar. While I personally think the default restrictions placed on symlinks limited their usefulness, there *was* a reasoned engineering analysis --- it wasn't one guy with an ulterior motive trying to avoid a bad review score. In fact, that practically never happens, at least consciously. We almost never make decisions individually, and while I maintain that social dynamics discourage risk-taking and spontaneous individual collaboration, I want to stress that we are not insane and we are not dysfunctional. The social forces I mentioned act as a drag on innovation, and I think we should do something about the aspects of our culture that I highlighted, but we're far from crippled. The negative effects are more like those incurred by mounting an unnecessary spoiler on a car than tearing out the engine block. What's indisputable fact is that our engineering division regularly runs and releases dependable, useful software that runs all over the world. No matter what you think of the Windows 8 UI, the system underneath is rock-solid, as was Windows 7, and I'm proud of having been a small part of this entire process.

I also want to apologize for what I said about devdiv. Look: I might disagree with the priorities of our compiler team, and I might be mystified by why certain C++ features took longer to implement for us than for the competition, but seriously good people work on the compiler. Of course they know what reference cycles are. We're one of the only organizations on earth that's built an impressive optimizing compiler from scratch, for crap's sake.

Last, I'm here because I've met good people and feel like I'm part of something special. I wouldn't be here if I thought Windows was an engineering nightmare. Everyone has problems, but people outside the company seem to infuse ours with special significance. I don't get that. In any case, I feel like my first post does wrong by people who are very dedicated and who work quite hard. They don't deserve the broad and ugly brush I used to paint them.

P.S. I have no problem with family people, and want to retract the offhand comment I made about them. I work with many awesome colleagues who happen to have children at home. What I really meant to say is that I don't like people who see what we do as more of a job than a passion, and it feels like we have a lot of these people these days. Maybe everyone does, though, or maybe I'm just completely wrong.

"""

mrb | Friday 10 May 2013 at 9:14 pm | | Default | 142 comments

ASIC Development Costs are Lower Than You Think

The minimal development costs for an ASIC are much lower than people think. These costs go toward:

  1. Designing the ASIC; the output is typically a GDSII file (a one-time cost: engineers' salaries + software licenses).
  2. Producing the mask set at the foundry (a one-time cost).
  3. Producing the wafers at the foundry (typically a fixed wafer lot cost + a cost per wafer).
  4. Slicing the wafers and packaging the chips (a fixed cost per wafer or per chip).

The cost of #1, designing the ASIC, is the most variable. It can be literally a few thousand dollars for a simple ASIC (e.g. a Bitcoin mining ASIC with simple repetitive SHA-256 units), up to billions of USD (e.g. the Cell processor, whose development cost 2 billion USD). In the case of a simple ASIC, the cost of #2, the mask set, is the biggest. #1 and #2 are usually referred to as NRE (non-recurring engineering) costs.
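
To make the structure explicit, here is a minimal sketch of the total cost as the sum of these four items. The numbers are made-up placeholders, not quotes from any foundry:

    def total_asic_cost(design_nre, mask_set_nre, wafer_lot_fixed,
                        cost_per_wafer, wafers, packaging_per_wafer):
        nre = design_nre + mask_set_nre                            # items #1 and #2
        production = wafer_lot_fixed + wafers * (cost_per_wafer + packaging_per_wafer)
        return nre + production                                    # items #3 and #4

    # Illustrative placeholder numbers: a simple design on a mature node, one 25-wafer lot.
    print(total_asic_cost(design_nre=30_000, mask_set_nre=80_000,
                          wafer_lot_fixed=20_000, cost_per_wafer=1_500,
                          wafers=25, packaging_per_wafer=500))     # -> 180000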

So what is the total cost of designing and producing a small initial batch of ASICs (the sum of costs #1 through #4)? Let's hear from real-world sources who did produce Bitcoin mining ASICs:

  • The Bitfountain company said: in China, the cost is ~150k USD for 130nm, and ~500k USD for 65nm. They ended up developing a 130nm ASIC.
  • The BitSyncom company, also in China, posted enough information to reveal that their costs at the TSMC foundry were 200-300k USD for their 110nm ASIC. This matches the amount they raised via crowdfunding (300 units * $1200 each = $360k).

So, there you have it, from two horses' mouths: ~150k USD for 130nm, 200-300k USD for 110nm, and ~500k USD for 65nm, as of 2013.

mrb | Friday 19 April 2013 at 10:25 pm | | Default | Six comments

Bitcoin Hits All-Time Trading High — This Time With Increased Economic Activity

Bitcoin just hit its all-time high exchange rate, three hours ago, breaking 32 USD/BTC on MtGox, the most popular Bitcoin exchange.

The last time it reached 32 USD/BTC was one year and eight months ago, on June 8, 2011. Back then it was very clearly a speculative valuation bubble: it had risen from 0.6 USD to 32 USD in merely two months (+5000%)! A few days later, the exchange rate fell and stabilized around 14-18 USD. Then a well-publicized hack of MtGox forced its closure for an entire week (June 20 to 26). When it re-opened, contrary to popular belief, there was no panic sell-off: the exchange rate remained around 14-18 USD for a good month. Then, from August through November, it resumed its fall, down to a low of 2 USD in November, which was a healthy and unavoidable correction of the speculative bubble. Of course, commentators who did not know better shouted "Bitcoin is dead!".

This time, things are different. Firstly, the increase in value has been relatively slower and steadier: +140% in the last two months, which is still a lot, but much less dramatic than the previous +5000% (nonetheless, I still think there is at least a small speculative aspect). Secondly, and most importantly, this time Bitcoin has genuinely experienced a corresponding increase in economic activity. Silicon Valley startups have been founded around the currency, and many small businesses have adopted it, including some high-profile ones: WordPress, Mega, Namecheap, Coinlab, Coinbase, etc. Of course the list would not be complete if I did not mention the infamous Silk Road marketplace, which went from zero to $2 million in monthly bitcoin revenue in its two years of existence, or the SatoshiDice gambling site, which went from zero to $0.5 million in profits during 2012.

If the exchange rate drops for whatever reason in the near future, I fully expect press articles announcing, again, "Bitcoin's death". And they will be, again, wrong. Bitcoin's adoption is obviously on the rise. There is a fixed number of them (21 million coins, divisible to the eighth decimal place), so their value should continue to increase in the long term, because of the laws of supply and demand.

mrb | Wednesday 27 February 2013 at 6:26 pm | | Default | One comment

Live Demonstration of Intel "Packet of Death"

Kristian Kielhofner has published interesting details about a bug affecting the Intel 82574L Ethernet controller: a so-called "packet of death" that causes the Intel NIC to completely lose link status until the next cold boot! Here is how you can reproduce the issue:

$ wget http://zorinaq.com/pub/intel-packet-of-death.txt

(If you just click the link to intel-packet-of-death.txt, it probably will not work, because you are already running a full-blown browser that has downloaded kilobytes from zorinaq.com; see below.)

As soon as you run this command, your machine should lose network connectivity. In order to maximize the chance of reproducing the bug:

  • Run this command right after a cold boot, because the very first packet of 1152 bytes (or more) that the NIC receives after a cold boot, with a byte value of 0x00-0x30 or 0x34-0xff at offset 0x47f, will inoculate it until the next reboot. A desktop system typically downloads kilobytes of data when booting (utilities checking for updates, etc.), so any of these large packets has a 253-in-256 chance of inoculating it.
  • The HTTP response sent by zorinaq.com must be received by your computer as one or more packets where the first one is at least 1152 bytes long (counting the Ethernet/IP/TCP/HTTP headers; my file assumes 14/20/20/0 bytes or more for these headers). The byte-at-offset condition is sketched in code right after this list.
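
The sketch below, derived only from the conditions listed above, classifies a raw Ethernet frame by its length and the byte at offset 0x47f (offsets counted from the start of the Ethernet header, matching the 14/20/20/0 header assumption of the test file):

    # Classify a raw Ethernet frame according to the conditions described above.

    def classify_frame(frame: bytes) -> str:
        if len(frame) < 1152:
            return "shorter than 1152 bytes: neither inoculates nor triggers"
        value = frame[0x47F]
        if value <= 0x30 or value >= 0x34:
            return f"byte 0x{value:02x} at offset 0x47f: inoculates the NIC until reboot"
        return f"byte 0x{value:02x} at offset 0x47f: can trigger the packet of death"

    # Example: a 1152-byte frame with 0x32 at the critical offset.
    frame = bytearray(1152)
    frame[0x47F] = 0x32
    print(classify_frame(bytes(frame)))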

mrb | Thursday 07 February 2013 at 12:37 am | | Default | Three comments