Monday, June 13, 2011

New Blog, Step Three: Privacy!

This summer I have an internship with the Tor Project through Google Summer of Code. I started a blog for that and to talk about privacy in general, which will never be a profitable endeavor. Ironically, I'm getting paid by Google to work on it, but for Google this is an entirely non-profit endeavor in the name of charity and the general betterment of the world.

Privacy and profit have something in common in that each is an elusive goal. Thanks to my friend Drake Wilson for suggesting the related name for the new site, Step Three: Privacy!

I'll still be posting here when I have startup and coding related matters to discuss, although I would recommend coders check out the other blog as well, as there's some neat stuff there if you do networking in Python.

Saturday, January 29, 2011

How to Help in Egypt: A Historical Perspective and a Call to Action

It's really amazing how the censorship bar keeps getting raised. When I co-founded Freenet over ten years ago, there were lots of assumptions shared about what online censorship was and how far people would be willing to go and also about what free speech was and what people wanted to communicate online. These assumptions have carried through to the design of today's censorship resistant systems. For instance, Tor still uses SSL. We used to think that no one in their right mind would block SSL because then they'd be blocking HTTPS and critical systems such as any online commerce. The essential assumption was that there was a certain level to which censors would not go and we just needed to hide our traffic below that level. This was a good assumption for a long time, but now the game has changed.

Iran was the first wake-up call. They went farther than China was ever willing to go by severely throttling SSL specifically. This was really a smart move because it didn't slow down the ability to read pages on the Internet, as most are unencrypted. It did slow down Tor and the ability to log in to any sites that use SSL for logins (hopefully all of them at this point). Since logins are required for most publishing services such as email, Twitter, Facebook, etc., this throttled the ability to send information out. Of course online commerce was affected, but they were willing to accept that. The Iran attack became the new gold standard in online censorship. All systems need to adapt to this new reality. Since SSL is now a target, SSL is no longer a good wrapper for traffic. This is why I started Dust, to provide a more modern transport layer for bypassing current censorship methods. I really believe we can make something which is undetectable and thus cannot be throttled or blocked. I think information theory is on our side here and that this is a war we can win.

However, Egypt raised the bar yet again by simply unplugging the Internet. This is remarkable as it not only shows how much farther censors are willing to go now, but also how the nature of online freedom of speech has changed. One of the classic examples we used to use to discuss the purpose of Freenet was that China blocked access to CNN. This seems like a comically naive goal now. People aren't trying to access news from major publishers. They're organizing protests via Twitter. This is totally decentralized content, it's peer-to-peer communication.

Unfortunately, this isn't a software fix. If the cables aren't plugged in, there are no clever ways we can encode the data to get it past the censors. This particular situation is a hardware problem. The infrastructure is centralized in such a way that it's easy for the government of a country to just switch it off and so this is what happened. Some respected individuals have called for the building of a new Internet without these problems.

I think this is a very noble endeavor, but I want to be straight with you about the problems with this idea. Essentially, we've tried this and it doesn't work. We've been trying this for ten years. Building an Internet out of Wifi mesh points is like wiring a city for electricity using USB cables. The 3G and 4G wireless Internet that we have now is connected by a high speed wired backbone and this is what makes it work well. There are many problems with an entirely mesh network, but the primary one is range. Once you start looking at coverage areas and doing the math you quickly discover that the number of mesh nodes required to cover any decent area is astronomical, particularly because you need to connect out to the larger Internet either by crossing the border to a friendly nation or connecting to a satellite link.

There is something we can do, though. We can design a custom network for these situations which, while it doesn't connect to the general Internet, provides network connectivity to people on the ground with each other. Here's a brief overview of my design:

Femtocells are superior to Wifi access points here. The computing device of choice is going to be the camera and GPS-equipped phone, not the laptop. Phones are used to drifting from tower to tower. They have hand-off protocols for switching towers seamlessly. That's why you can talk on the phone while driving down the highway. A femtocell is essentially a "fake" cellular phone tower that intercepts your phone signals and routes them over your own network connection instead of the phone company backbone. You normally get these to improve reception in areas with poor or non-existent tower coverage. Places like, for instance, Egypt right now.

My proposal is to combine portable, battery-powered femtocells with a custom backend that, instead of routing your data packets over an ethernet connection, stores the data for exchange on a store-and-forward mesh network, much as in the FidoNet network referred to in the Rushkoff article. Then, instead of having fixed towers and moving phones we have moving towers.

This is all kind of technical, I suppose, so let me break it down for you in terms of a use case. You're in Egypt and you want to get news about what's going on, send out videos of important happenings to the world, and organize with your fellow citizens to take political action. You have a phone with a camera and text and MMS messaging. There are mobile cell phone towers roaming around the city. (This is something you'd need to organize and is a whole issue in itself, but some people specialize in this kind of theory. It's solvable.) When you come in range of an access point, you can send text and MMS messages. You also receive any that have been sent to you. The access point is actually another citizen with a backpack femtocell and battery. They could be walking, although I've also seen a similar plan executed using motorcycles. The tower stores your sent messages. The towers move in such a pattern that they come into range of each other. At this point they exchange stored messages. When a tower comes in range of a phone for which it has stored messages, it sends them to the phone and then deletes them. Sending messages to people in your phone contact list works the same as always. Getting information out of the country just requires sending a text or MMS message to someone that is known to have a satellite, dial-up, or other link outside. Once the information moves through the mesh to them, they can send it on.
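To make the store-and-forward behavior in this use case concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not a femtocell implementation: the `Tower` class, its method names, and the phone identifiers are all made up for the example, and radio range is modeled simply by calling a method.

```python
from collections import defaultdict


class Tower:
    """A mobile femtocell that stores messages until it can deliver them.

    Hypothetical model for illustration only; "in range" is represented
    by calling accept(), exchange(), or deliver().
    """

    def __init__(self, name):
        self.name = name
        self.stored = defaultdict(list)  # recipient phone -> queued messages

    def accept(self, recipient, text):
        """A phone in range hands the tower a message to carry."""
        self.stored[recipient].append(text)

    def exchange(self, other):
        """Two towers in range copy each other's stored messages."""
        for src, dst in ((self, other), (other, self)):
            for recipient, texts in list(src.stored.items()):
                for text in texts:
                    if text not in dst.stored[recipient]:
                        dst.stored[recipient].append(text)

    def deliver(self, recipient):
        """Hand over everything queued for a phone in range, then delete it."""
        return self.stored.pop(recipient, [])


# A message dropped off at tower a reaches its recipient through tower b.
a, b = Tower("a"), Tower("b")
a.accept("phone-2", "meet at the square")
a.exchange(b)
delivered = b.deliver("phone-2")
```

Note that in this sketch tower a still holds its own copy after tower b delivers. A real system would need message IDs and some expiry or acknowledgement scheme to avoid duplicate deliveries, which is left out here.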

I think this is the right way to do decentralized mesh networking in situations like what is happening now in Egypt. This is something we can build right now. I'm ready to start on this whenever you are. The first step is that we're going to need some femtocells. After that, it becomes a software problem again, like hacking a Wifi router to run OpenWRT.

If this is a topic you're interested in, I will be giving a talk about this on March 11 at the Dorkbot SXSW event: The Vision of the Future: 2021. Come by and say hi and we can figure out how to make this happen.

Wednesday, January 5, 2011

Retro Indie Game Development with HTML5 - The Series

Like many web developers I've become interested in the recent developments in HTML5. Web browsers can now do things they've never been able to do and it's an exciting time. It's also an exciting time for games right now. Two phenomena in game development have arisen which make game development fun again: retro games and indie games, although they are often found in conjunction. Retro games use the old school 8-bit graphics and sound we loved when we were kids. Some of these retro titles come from studios, such as the new Megaman games or Cave Story for the Wii. There has also been a rising tide of indie games such as Minecraft and the games of the Humble Indie Bundle such as Braid. Some of these have retro graphics and some like Machinarium have pretty nice art. Not to say that low resolution art isn't nice; "pixel art" has become its own genre with its own talented artists. Some of these indie games have actually been quite successful, with both Minecraft and the Humble Indie Bundle raising millions of dollars in sales.

I think the simultaneous rise of HTML5, the retro style, and the commercial success of the indie development methodology have created a great opportunity for the hacker turned entrepreneur. Additionally, all of the open source code, DIY tools, and Creative Commons licensed artwork provide a relatively low barrier to entry. Take, for instance, Realm of the Mad God. This is a really fun retro indie MMO. It was actually developed as part of a contest where artists first made Creative Commons licensed art assets and programmers then used these to make a game. The map generation code is also open source. You could go out and write a game like this today and, even better, you don't have to do it in Flash like they did. A game like this could be written in pure HTML/CSS/Javascript, which is good news for web developers that have already been working in this medium for a while.

My plan is to write a series of posts about all of the great things I've discovered about developing retro indie HTML5 games as I've been working on my own game. While this may seem like a very specific topic, it opens up the doors to a variety of topics with nice concrete examples. For instance, I've often wondered, HTML5 sounds cool I guess but what is it good for, actually? In developing my game, it became quite apparent that it would be pretty much impossible without some very specific HTML5 features, not the obvious things like the Canvas and Audio APIs, but specifically Web Workers have been indispensable.

So watch the blog for future posts in this series. I'm going to start with Akihabara, the HTML5 game library specifically designed for retro games. Also let me know if there's anything specific you're interested in, and if I have something to share on the subject then I'll try to make a post about it.


Thursday, June 17, 2010

The Original Introduction Problem in P2P Networks

BitCoin was released this week, a very interesting P2P currency based on proof-of-work with a novel method to deal with double-spending via a P2P timestamp server. Cool stuff.

On the BitCoin forums, a discussion was going on regarding how new BitCoin nodes connect to IRC in order to find other BitCoin nodes. This method was somewhat controversial: it was drawing the ire of the IRC network admins, to whom it looked like a botnet. Additionally, if the IRC server goes down then new users can't join the BitCoin network. However, what are you going to do? When you first run a node, it doesn't know about any other nodes. It's a tough situation.

This is a common problem in P2P, known as Original Introduction, although bootstrapping is also a good word for it. The problem with bootstrapping is that you can't decentralize it. Whether it's IRC or HTTP or DNS, the client needs to be hardcoded with an address or list of addresses which is sufficiently fresh that at least one of the listed addresses is still active. After the first node is reached, you are no longer in Original Introduction mode and can use the full range of techniques for decentralization, such as gossip. Unless, of course, you get disconnected from the network and all of your known peers go away, in which case you're back to bootstrapping.
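As a rough sketch of what Original Introduction looks like in code, the client just walks its hardcoded list until something answers. Everything here is an assumption for the sake of illustration: the `bootstrap` function, the `try_connect` callable, and the addresses (which come from the reserved documentation ranges) are not taken from any real client.

```python
def bootstrap(seed_addresses, try_connect):
    """Return the first seed that answers, or None if every seed is stale.

    seed_addresses: the hardcoded list shipped with the client (hypothetical).
    try_connect: callable taking an address and returning True on success.
    """
    for address in seed_addresses:
        if try_connect(address):
            return address
    return None


# Stand-in for real networking: only the last seed is still alive.
alive = {"203.0.113.7:9000"}
seeds = ["192.0.2.1:9000", "198.51.100.4:9000", "203.0.113.7:9000"]
first_peer = bootstrap(seeds, lambda addr: addr in alive)
```

If `bootstrap` returns `None`, the list was not fresh enough and the client is stuck, which is exactly the failure mode described above.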

There are two properties that are at odds when you choose a bootstrapping method: robustness (scalability/reliability) and freshness. Robustness is increased at the expense of freshness by caching on multiple servers, as is usually done with HTTP peer lists. Freshness is maximized (at least up to the TCP timeout) at the expense of robustness by having everyone connected, as with IRC. Of course, the key is finding the right mix of robustness and freshness because you need both for the bootstrap to be successful.

Here are some of my current favorite methods for bootstrapping:

Append list of fresh peers to executable or installer dynamically on download. People usually get the application from its official website, so the website is already a point of failure for new users. You're already hardcoding an address in the application, the address that the application will use to bootstrap. So instead just add fresh peers at the moment of download. You need some fancy code in the executable to read the list off the end, but I've implemented this in an NSIS installer and it's not that hard. Most software developers are upset by the idea of this method.

Connect via XMPP to Google App Engine application. This gives the freshness of IRC, but with more robust scaling. App Engine is mostly for writing web apps, but it provides email and XMPP handling as well. It would be simple to write one application that could handle peer lists via either XMPP or HTTP with the same handler code. I'm currently using this in an application and it works well and is very reliable. I only wish there was a second App Engine to use as a fallback because it does have occasional downtime.

An alternative to requiring all nodes to include the complexity of a protocol like IRC or XMPP is to have a few special sentinel nodes which sit on the network and collect addresses of connected nodes via the usual decentralized methods available to an active node. These sentinel nodes periodically upload fresh addresses, say via HTTP POST to a number of websites. A new node can then download a fresh address list from any of the websites which is currently functioning and reachable. If you have 5 sentinels each uploading every 5 minutes (staggered), then you'll have updates roughly once a minute. This is on par with IRC in terms of freshness and is as robust as you care to make it by varying the number of HTTP mirrors and the number of sentinels.
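The first method above, reading a peer list off the end of the executable, can be sketched in a few lines. This is not the NSIS code I mentioned; it's a Python illustration with a made-up trailer layout (JSON peer list, then a 4-byte length, then a marker). The `MAGIC` marker and the function names are hypothetical.

```python
import json
import struct

MAGIC = b"PEERS1"  # hypothetical marker, not from any real installer format


def append_peers(executable: bytes, peers: list) -> bytes:
    """Return the executable bytes with a peer-list trailer added at the end."""
    blob = json.dumps(peers).encode("utf-8")
    return executable + blob + struct.pack(">I", len(blob)) + MAGIC


def read_peers(data: bytes) -> list:
    """Read the peer list off the end, or return [] if no trailer is present."""
    if len(data) < len(MAGIC) + 4 or not data.endswith(MAGIC):
        return []
    length_end = len(data) - len(MAGIC)
    (length,) = struct.unpack(">I", data[length_end - 4:length_end])
    blob = data[length_end - 4 - length:length_end - 4]
    return json.loads(blob.decode("utf-8"))


# The download server would stamp the file; the client reads it on first run.
stamped = append_peers(b"\x7fELF-fake-binary", ["192.0.2.1:9000", "192.0.2.2:9000"])
peers = read_peers(stamped)
```

Putting the length and marker at the very end means the reader never has to know where the original executable stops.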

Monday, June 7, 2010

The Truth About Mobile Bandwidth Pricing

AT&T just ended unlimited bandwidth for the iPhone and people seem to be confused about what this means. As a follow-up to my post on consumer bandwidth pricing, let me break down the mobile bandwidth pricing strategies for you.

It's not really a cap, it's a pricing strategy. Also, it's not about keeping a few extreme users from ruining the network for everyone. For congestion management you'd need peak usage pricing like electricity companies use, only for geographical areas instead of (or in addition to) time-based pricing. For instance, raise the price of bandwidth in Manhattan during daytime and at the Austin Convention Center's cell tower during SXSW. Cumulative usage-based pricing doesn't solve congestion. It's just a strategy to raise prices.

Here's the breakdown of how much you'll pay per month depending on your data usage on the various networks that support smartphones.


As you can see, AT&T starts low and then after the 2GB "cap" quickly cuts across all the prices of the carriers that offer unlimited bandwidth. If you actually use less than 2GB/month, it's still a pretty good deal, second only to Sprint. At 4GB/month, it's the most expensive.

Also notice that T-Mobile is more expensive if you get a 2-year contract than if you have no contract. This is their terrible new pricing plan in which they no longer subsidize phones in order to lock you into a contract. Instead, they essentially finance your phone by having you pay less up front but then more per month. When your 2-year contract is up, you will have paid more than you saved on the initial phone purchase. So if you get a T-Mobile phone, don't get a contract. Just buy the phone outright.

Tuesday, April 21, 2009

The Truth About Consumer Bandwidth Pricing

There's been a lot of noise made recently about Time Warner instituting bandwidth caps. Everyone was angry at Time Warner, whereas Time Warner claims it's losing money because of a few people hogging all the bandwidth, that usage based pricing is more fair and also necessary to pay for building up their networks, and that all of this BitTorrent traffic and streaming video is killing their networks and needs to be capped.

I have an inside perspective on this matter because when I was the Director of Product Management at BitTorrent, we often spoke with ISPs. We knew that Comcast was throttling BitTorrent traffic long before it made it into the news and I flew down to Comcast headquarters in Philadelphia to discuss the situation. I was surprised when they told me that they had plenty of bandwidth and that BitTorrent wasn't anywhere close to crushing their network. Their problem was that they don't want to sell bandwidth, a commodity with a price racing to zero. They want to sell entertainment services, which have a higher profit margin. They are therefore threatened by online video as it competes with cable TV.

The consumer ISP strategy thus has a twofold purpose: raise the price of bandwidth, and at the same time make the Internet a less appealing way to watch video. Both of these purposes are accomplished by bandwidth caps. Additionally, the new pricing models make it complicated to determine how much you're going to be paying exactly for bandwidth, allowing the ISPs to increase prices covertly. If they were to just declare that prices were going up because they felt like it, people would be very angry indeed, and it might lead to government regulation of pricing.

In order to unravel the mystery of the new pricing models, I've made some graphs that show how much you will pay in dollars for a number of total gigabytes transferred in a month. I was very surprised by the results.

To start, here is a graph of a lot of different plans, such as various Time Warner plans, AT&T DSL, and the main 3G mobile carriers.




On the bottom is gigabytes and on the left is dollars. Yes, dollars. 300 GB would cost you $140,000 on AT&T 3G. You'll notice that only the 3G providers show up at all, everything else being squished into a single line on the bottom. This is because while Time Warner is charging overages of $1/GB, Sprint is charging $50/GB, Verizon $280/GB, and AT&T a ridiculous $480/GB after you exceed the 5GB cap. Everyone is mad about the Time Warner caps, but it's really the 3G caps that are totally insane. Every iPhone user is on AT&T, so when Hulu for iPhone comes out it's going to be crazy.
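If you want to check the 3G numbers yourself, the arithmetic is simple. This sketch uses the per-GB overage rates quoted above and assumes the 5GB cap applies to all three carriers. It leaves out the base monthly fee, so the totals are overage charges only.

```python
CAP_GB = 5  # the 5GB cap mentioned above, assumed to apply to all three carriers
OVERAGE_PER_GB = {"Sprint": 50, "Verizon": 280, "AT&T": 480}  # dollars per GB


def overage(carrier, gigabytes):
    """Overage charge in dollars, excluding the base monthly fee."""
    return max(0, gigabytes - CAP_GB) * OVERAGE_PER_GB[carrier]


# Overage charges for a 300 GB month on each carrier.
charges = {carrier: overage(carrier, 300) for carrier in OVERAGE_PER_GB}
```

For AT&T this works out to 295 GB x $480 = $141,600 in overages, which is where the roughly $140,000 figure comes from.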

So don't use more than 5GB of 3G per month or else you're getting ripped off. Let's compare some ISPs just in the 1-5GB range to see how they stack up.



Amazon S3 is included here at the bottom just to show how much more expensive consumer bandwidth is than hosting bandwidth. The bottom tier of Time Warner service is a clear winner here, followed by the original capper Comcast. 3G services are in the middle, with premium tier cable and DSL services losing. In this bandwidth bracket, you don't really get much benefit from upgrading your service.


Now let's look at ISP choices excluding 3G.



The lowest Time Warner tier wins again if you use little bandwidth, and then Comcast wins everything else up to 250GB, where they have put a hard cap.


Now let's look in depth at just the Time Warner tiers.


The graph is interesting because Time Warner imposes an overage fee cap of $75. This causes the lowest tier to come out best for both low and high numbers of gigabytes. The lowest tier charges $15/month for 1GB and $2/GB for each additional GB, up to $75 in overages, meaning that your total bill is capped at $90. You therefore get unlimited bandwidth for $90 with that plan. Whereas their highest tier plan is $75 for 100 GB and then $1/GB after that up to $75 in overage charges. You get unlimited bandwidth for $150 with this plan. So the lowest tier wins and the highest tier loses. The middle tiers only come into play for medium amounts of bandwidth.
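Here is the capped-overage math for the two tiers described above as a small Python sketch. The numbers are the ones quoted in this post; the function names are just for illustration.

```python
def monthly_bill(base, included_gb, per_gb, gigabytes, overage_cap=75):
    """Total monthly bill in dollars for a plan with capped overage fees."""
    extra_gb = max(0, gigabytes - included_gb)
    return base + min(overage_cap, extra_gb * per_gb)


def lowest_tier(gigabytes):
    """$15/month for 1GB, then $2/GB, with overages capped at $75."""
    return monthly_bill(15, 1, 2, gigabytes)


def highest_tier(gigabytes):
    """$75/month for 100GB, then $1/GB, with overages capped at $75."""
    return monthly_bill(75, 100, 1, gigabytes)
```

For any usage heavy enough to hit the overage cap on both plans, `lowest_tier` returns 90 and `highest_tier` returns 150, which is why the lowest tier wins at the high end as well as the low end.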

So, let's look at medium amounts of bandwidth where the multiple tiers come into play.


This graph shows a situation similar to the one pitched by Time Warner. There are multiple tiers and you get the best deal by choosing the right tier for the amount of bandwidth you use. However, note that the goal is not to avoid overages. The goal is to avoid having your overage charges cost more than the monthly charge of the next plan up. So while the lowest tier only includes 1GB/month, it's the best plan up to around 10GB/month. Similarly, the standard plan will be better than an upgrade up to 50GB/month. The highest tier is only good for people that use >80 GB/month. And Time Warner Business Class is, as shown on all of the graphs, always just a terrible deal.


It was just discovered that AT&T DSL is implementing bandwidth caps. They have a different model because they don't have a cap on overage fees. That sounds like it would probably be a worse deal than Time Warner. Let's take a look, first at just the different AT&T DSL tiers.


This is the more classical model that you'd expect with overages. Since there are no caps on overage fees, you get the best deal by choosing a plan matched to your usage. If you guess incorrectly, you overpay. The ordering of plans from cheapest to most expensive becomes inverted from low usage to high usage.

Now let's compare the various AT&T DSL plans to the various Time Warner cable plans.


There are a lot of lines on this graph, but you only need to look at the bottom. The lowest tier of Time Warner again wins for low bandwidth. After that, successive AT&T DSL plans win. Despite the fact that their pricing structure is worse, their actual prices are better than Time Warner as long as you're good at guessing how much bandwidth you're going to use. If you're bad at guessing, only the lowest two tiers of Time Warner could ever possibly be better than AT&T DSL and only for a small range of usage. So if you're bad at guessing your usage, your best bet is to get the highest tier of AT&T DSL.

Conclusions

I was surprised by the outcome of these charts. The Time Warner caps are not that big of a deal and the AT&T caps are even less of a big deal. What you really need to watch out for is the 3G caps. Those are just totally off the rails.

The best deal for consumer Internet is AT&T DSL, even with the caps and overage fees. If you know how much bandwidth you're going to use, buy the appropriate tier. If you don't know how much bandwidth you're going to use, you're safest buying the highest tier.

If you're going to go with Time Warner, the lower tiers are a better deal. Go with the lowest tier you can and only upgrade if your overage fees are costing you more than the next tier. Never buy the highest tier or business class, they are ripoffs.

3G is a terrible deal. If you use less than 5GB a month, all the 3G providers are priced the same and are not a very good deal for Internet. Use the lowest tier of Time Warner instead. Under no circumstances use more than 5GB of 3G in a month, you will get ripped off big time.

Also, Hulu for iPhone is going to be a train wreck.

Friday, February 27, 2009

Diakonos: A Programmer's Text Editor in Ruby

A text editor (or for some an IDE) is the most important tool a programmer has, other than the programming language itself. Religious wars over editors are inevitable because people spend so much time with their editor. Some people flip-flop, but many people become both functionally and emotionally attached. No one wants to spend time learning new keybindings when they could be programming instead.

Personally, I use nano. This is not out of ignorance, mental damage, or a deep moral perversion as my friends that use emacs and vi insist. I want an editor which is small and quick to install. It must be available on all platforms and easy to install (if there's no Debian/Ubuntu package in the main repositories, forget it). I'm not going to mess around with configuring it. And I basically just don't like vi. So nano has been winning the war for my soul for many years. However, like all programmers, I dream of a better world. I wouldn't mind a slightly (or even somewhat) better editor, but everything I've ever tried lacked the beautiful simplicity of nano. With more features comes more hassle.

Then I found Diakonos. It's a console-based text editor (which I like because I ssh into my server and edit things as much as I edit them locally), and it's written in Ruby. It has the modern features, such as multiple buffers, syntax highlighting, and syntax-aware indentation. It's scriptable, either through the Ruby interface or through external programs (in any language) which are fed the old buffer on stdin and output new buffer contents on stdout.
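To give a feel for the external-program style of scripting, here is a minimal filter in Python that reads the old buffer on stdin and writes the new buffer on stdout. The `transform` function and the choice of stripping trailing whitespace are just an example, and how you hook a script like this up to a key inside Diakonos is not shown here, so check its documentation for that part.

```python
import sys


def transform(text):
    """Strip trailing whitespace from every line of the buffer."""
    return "\n".join(line.rstrip() for line in text.split("\n"))


if __name__ == "__main__":
    # Old buffer contents come in on stdin; new contents go out on stdout.
    sys.stdout.write(transform(sys.stdin.read()))
```

Because it is a plain stdin-to-stdout filter, you can test it from a shell by piping a file through it before wiring it into the editor.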

Like all editors under my consideration, it has packages in the main repositories of both Debian and Ubuntu. It also has Windows and OS X binaries (also a Ruby gem for you Ruby guys). It's as quick and easy to install as nano, and though it has lots more features, they are not obtrusive. The keybindings are the "standard" Windows-style ones (ctrl-x cut, ctrl-c copy, ctrl-v paste). You can of course configure it to emacs or whatever style you want, but I am personally happy to use a similar set of keys across my editor and web browser.

I am particularly excited about finally having an editor that's not written in C. This is a personal issue. Many people like C, but I just think it's time for us to move on as a society. I have a T-shirt that says "I would code in C for love, but not for money." While you may love C, autoconf, and make, I am personally very excited about an editor both written in and scriptable in Ruby. It seems like a step towards the future. It's also nice to have a fresh codebase which doesn't inherit several decades of design decisions.

My apologies for insulting your favorite text editors and programming languages, my Internet friends. I meant no harm. Just check out Diakonos for a bit and see what you think. It has a feel which is both fresh and yet somehow also classic. A "modern classic" if you will. And it's fun. In a way I can't really articulate, it's just enjoyable to use. Also, the author is a really nice guy and the IRC channel isn't full of obnoxious jerks (#mathetes on freenode), just good folks like you and me, hacking on code. I'll see you there!