<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cerrio</title>
	<atom:link href="http://cerrio.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://cerrio.com</link>
	<description>Real time data in the cloud</description>
	<lastBuildDate>Mon, 12 Sep 2011 19:34:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Node.js HTTPS Post Quirk</title>
		<link>http://cerrio.com/node-js-https-post-quirk/</link>
		<comments>http://cerrio.com/node-js-https-post-quirk/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 18:54:05 +0000</pubDate>
		<dc:creator>Brian Willard</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cerrio.com/?p=481</guid>
		<description><![CDATA[Recently I have been doing a lot of work with Node.Js. I am a huge fan. But I just spent an hour trying to figure out why my code wasn&#8217;t working: and was getting the following exceptions: Error: socket hang up 13:49:20 web.1 &#124; at CleartextStream. (http.js:1272:45) 13:49:20 web.1 &#124; at CleartextStream.emit (events.js:61:17) 13:49:20 web.1 [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been doing a lot of work with <a href="http://nodejs.org">Node.js</a>.  I am a huge fan.  But I just spent an hour trying to figure out why this code wasn&#8217;t working:</p>
<pre class="brush: jscript; title: ; notranslate">
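var https = require('https');
// Broken version: the path is missing its leading slash.
var request  = https.request({
    host: 'api.cerrio.com',
    path:&quot;api/v0.1/r/auth/Sandbox/sessions&quot;,
    method:&quot;POST&quot;},
 function(res) {
    console.log('got callback from sessions');
 }
);
request.end(); // nothing is sent until end() is called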
</pre>
<p>I was getting the following exception:</p>
<p><code><br />
Error: socket hang up<br />
13:49:20 web.1     |     at CleartextStream.&lt;anonymous&gt; (http.js:1272:45)<br />
13:49:20 web.1     |     at CleartextStream.emit (events.js:61:17)<br />
13:49:20 web.1     |     at Array.1 (tls.js:617:22)<br />
13:49:20 web.1     |     at EventEmitter._tickCallback (node.js:126:26)<br />
</code></p>
<p>After many dead ends, the answer turned out to be that the path needs to start with a /. The correct code is:</p>
<pre class="brush: jscript; title: ; notranslate">
var https = require('https');
// Fixed version: the path starts with a slash.
var request  = https.request({
    host: 'api.cerrio.com',
    path:&quot;/api/v0.1/r/auth/Sandbox/sessions&quot;,
    method:&quot;POST&quot;},
 function(res) {
    console.log('got callback from sessions');
 }
);
request.end(); // nothing is sent until end() is called
</pre>
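<p>One way to avoid tripping over this again is to normalize the path before building the request options. The sketch below is only an illustration and not part of the original fix: <code>normalizePath</code> and <code>buildOptions</code> are hypothetical helper names, and the snippet builds the options object without opening a connection.</p>

```javascript
// Hypothetical helper: make sure a request path starts with "/".
function normalizePath(path) {
  return path.charAt(0) === '/' ? path : '/' + path;
}

// Hypothetical helper: build an options object suitable for https.request().
function buildOptions(host, path, method) {
  return { host: host, path: normalizePath(path), method: method };
}

var options = buildOptions('api.cerrio.com', 'api/v0.1/r/auth/Sandbox/sessions', 'POST');
console.log(options.path); // prints "/api/v0.1/r/auth/Sandbox/sessions"
```

<p>The resulting object can be passed to <code>https.request</code> in place of the inline literal. Attaching an <code>'error'</code> listener to the returned request is also worth doing, so that a failure like the one above is reported through a handler instead of an uncaught exception.</p>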
<p>Hopefully this saves someone some time with the same issue.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/node-js-https-post-quirk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Solve Real-Time Software Problems Visually</title>
		<link>http://cerrio.com/solve-real-time-software-visually/</link>
		<comments>http://cerrio.com/solve-real-time-software-visually/#comments</comments>
		<pubDate>Wed, 17 Aug 2011 15:30:51 +0000</pubDate>
		<dc:creator>Richard Bojanowski</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cerrio.com/?p=469</guid>
		<description><![CDATA[Cerrio is a platform (PaaS) that makes adding real-time data to your applications very easy.  One way we make real-time easy is by taking advantage of visualizations. Real-time software systems, not unlike most software systems, quickly become complex.  Developing under pressure with multiple sources of real-time and static data, different API’s, multiple languages, and new [...]]]></description>
			<content:encoded><![CDATA[<p>Cerrio is a platform (PaaS) that makes adding real-time data to your applications very easy.  One way we make real-time easy is by taking advantage of visualizations.</p>
<p>Real-time software systems, like most software systems, quickly become complex.  Developing under pressure with multiple sources of real-time and static data, different APIs, multiple languages, and new technologies inevitably results in spaghetti code.</p>
<p>Building real-time software systems with traditional technologies is hard.  The problems being solved are difficult and there are many approaches that can be taken.  It is nearly impossible to keep the entire problem space in your head and know how any given change will propagate through these systems.  And once the real-time system is built it is difficult to maintain.  You are required to remember what you, or another developer, was thinking when the system was built.</p>
<p>Real-time software systems are in need of visualization tools.  In fact, they are in need of <strong><em>real-time</em></strong> visualization tools.  Complex problems like those posed by real-time software systems lend themselves well to visualization tools.  The human brain is an amazing organ.  But it is better at solving certain classes of problems than it is others.  Most people can only have <a href="http://www.livescience.com/2493-mind-limit-4.html">three to four items in their working memory</a> at any given time.  Most real-time systems have far more than three or four logical processes and data paths.  So until we figure out a way to install additional flash memory into our prefrontal cortexes, we need to find alternative solutions.</p>
<p>Leveraging the human visual system is a natural fit for this problem.  Humans are a visual species.  We evolved to hunt during the day when we see well and sleep at night when we can’t.  Our brains can quickly parse enormous amounts of visual data and immediately draw our attention to whatever is the most important.  We quickly recognize shapes, colors, patterns and movement without &#8220;thinking&#8221;.</p>
<p>So, the benefits of visualizations are many.  Visualizations allow us to leverage the massive parallel processing capabilities of our visual system.  They allow us to work with data that is represented externally from our working memory and, therefore, process more information.  Visualizations can represent far more data than we can remember and allow us to recognize information rather than recall it.  And, unlike a screen full of code, the visualization can quickly call our attention to where our attention needs to be.</p>
<p>With that in mind, Cerrio built its platform to be visual.  Our visual interface makes building and maintaining real-time software surprisingly simple and intuitive.  In this series of blog posts on visualizing real-time systems, I will explore some of the different ways you can visually work with your data on Cerrio.</p>
<p>But before I call this post done, I would like to lay out an example of a complex real-time system built on Cerrio.  BBR Trading, a Chicago-based derivatives trading firm, has built all of its trading systems on Cerrio.  One of those systems is a risk management system.  It combines real-time data from the stock and options exchanges, real-time data from third-party trading venues, flat files from the clearing house, and output from the firm’s pricing models, among other data.  BBR uses that data to generate a real-time view of its risk and its profit and loss.  This is an example of solving a very complex problem using Cerrio’s generic, visual components.  Instead of writing thousands of lines of code, BBR’s developers built the system that is visualized in the much simpler picture below.  They can look at that picture and quickly see where data is flowing from and to, and spot any problems that occur with the data.</p>
<div id="attachment_470" class="wp-caption aligncenter" style="width: 526px"><a href="http://cerrio.wpengine.netdna-cdn.com/wp-content/uploads/2011/08/RiskManagement1.png"><img class="size-large wp-image-470   " title="RiskManagement1" src="http://cerrio.wpengine.netdna-cdn.com/wp-content/uploads/2011/08/RiskManagement1-1024x954.png" alt="" width="516" height="481" /></a><p class="wp-caption-text">BBR Trading&#39;s risk management system.</p></div>
<p>The developer maintaining BBR’s risk management system can:</p>
<ol>
<li>See how data is flowing through the system</li>
<li>Quickly identify where each process gets its data</li>
<li>Quickly identify where data is flowing upstream</li>
<li>Be visually alerted to where problems are occurring in the system</li>
<li>Use visual process control tools to address problems</li>
<li>Quickly review custom diagnostics and performance stats</li>
<li>View documentation embedded into each process</li>
</ol>
<p>In my series of posts on visualizing real-time systems, I will explore how developers of real-time systems can benefit from these visualizations.  If you have any specific questions, just send me an email at <a href="mailto:rbojanowski@cerrio.com">rbojanowski@cerrio.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/solve-real-time-software-visually/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My Latest Typo</title>
		<link>http://cerrio.com/my-latest-typo/</link>
		<comments>http://cerrio.com/my-latest-typo/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 08:02:37 +0000</pubDate>
		<dc:creator>Brian Willard</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://cerrio.wpengine.com/?p=405</guid>
		<description><![CDATA[You’ve probably noticed by now that I make a lot of typos.  And it’s not just when I am writing in English.  I make a lot of typos when I am programming too. I use tools to prevent and mitigate most of my mistakes in code. Resharper, AgentSmith, and the compiler (always use warnings as [...]]]></description>
			<content:encoded><![CDATA[<p>You’ve probably noticed by now that I make a lot of typos.  And it’s not just when I am writing in English.  I make a lot of typos when I am programming too. I use tools to prevent and mitigate most of my mistakes in code. <a href="http://www.jetbrains.com/resharper">ReSharper</a>, <a href="http://www.agentsmithplugin.com/">AgentSmith</a>, and the compiler (always treat warnings as errors) keep me from hurting myself too much.</p>
<p>But today I figured out a new way to screw stuff up:</p>
<pre class="brush: csharp; title: ; notranslate">
            int x = 5;
            x =+ 6;
            Assert.AreEqual(11,x);
</pre>
<p>At first glance you would think this test would pass. But if you are an observant person (unlike me), or take a second look at the order of the characters in =+, you&#8217;ll notice I screwed up. I meant to type +=. I was shocked that this compiles, but it does. The compiler treats it as x = (+6), which seems pretty clear when written that way, so x ends up as 6 and the assertion fails. It turns out that this is just the <a href="http://msdn.microsoft.com/en-us/library/aa691365(v=VS.71).aspx">unary plus operator</a> at work.</p>
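<p>The same slip is possible in other C-family languages that have a unary plus. As an illustration only (this is JavaScript run under Node.js, not the C# from the test above), the snippet below puts the typo and the intended operator side by side.</p>

```javascript
// JavaScript illustration; the original test above is C#.
var x = 5;
x =+ 6;          // parsed as x = (+6): a plain assignment of positive six
console.log(x);  // prints 6

var y = 5;
y += 6;          // the intended compound assignment
console.log(y);  // prints 11
```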
<p>Just another thing to keep an eye out for when you write some code that you swear should work but doesn’t. It is nice that in the sample code above ReSharper warns you that the 5 is never used. In the actual code I was writing, static analysis couldn’t tell that there was a dead assignment because I was in a loop.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/my-latest-typo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fast .Net BitConverter</title>
		<link>http://cerrio.com/fast-bitconverter/</link>
		<comments>http://cerrio.com/fast-bitconverter/#comments</comments>
		<pubDate>Fri, 29 Jul 2011 18:58:03 +0000</pubDate>
		<dc:creator>Brian Willard</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://cerrio.wpengine.com/?p=403</guid>
		<description><![CDATA[tl;dr: If you are running .Net on Windows use FastBitConverter instead of System.BitConverter for a 50% speed up. The BitConverter class provides functionality for converting data to and from byte[]. This is very useful for sending data outside of your program. It is the perfect thing to use in almost every situation. The one situation [...]]]></description>
			<content:encoded><![CDATA[<p><strong>tl;dr:</strong> If you are running .Net on Windows, use <a href="http://cerrio.wpengine.com/wp-content/uploads/2011/07/FastBitConverter.cs_.txt">FastBitConverter</a> instead of System.BitConverter for a 50% speed up.</p>
<p>The <a href="http://msdn.microsoft.com/en-us/library/3kftcaf9.aspx">BitConverter</a> class provides functionality for converting data to and from byte[]. This is very useful for sending data outside of your program. It is the perfect thing to use in almost every situation. The one situation where you might want to think about alternatives is if the performance of the calls you are making to BitConverter really matters (and certain conditions are met).</p>
<p>Before you yell at me about how <a href="http://en.wikiquote.org/wiki/Donald_Knuth">premature optimization is the root of all evil</a> let me say that I couldn&#8217;t agree more. Please don’t use the following techniques until you have used a profiler on your application and determined your calls to BitConverter are actually a limiting factor in your application.</p>
<p>Before we get to how to improve BitConverter we need to understand what it is doing. So let’s load it up in <a href="http://wiki.sharpdevelop.net/ILSpy.ashx">ILSpy</a> and look at the method <a href="http://msdn.microsoft.com/en-us/library/system.bitconverter.toint32.aspx">ToInt32</a>. This method takes a byte[] and a starting index, and returns the int that is represented by the four bytes at that index.</p>
<pre class="brush: csharp; title: ; notranslate">
public unsafe static int ToInt32(byte[] value, int startIndex)
{
	if (value == null)
	{
		ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
	}
	if ((ulong)startIndex &gt;= (ulong)((long)value.Length))
	{
		ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startIndex, ExceptionResource.ArgumentOutOfRange_Index);
	}
	if (startIndex &gt; value.Length - 4)
	{
		ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_ArrayPlusOffTooSmall);
	}
	int result;
	if (startIndex % 4 == 0)
	{
		result = *(int*)(&amp;value[startIndex]);
	}
	else
	{
		if (BitConverter.IsLittleEndian)
		{
			result = ((int)(*(&amp;value[startIndex])) | (int)(&amp;value[startIndex])[(IntPtr)1 / 1] &lt;&lt; 8 | (int)(&amp;value[startIndex])[(IntPtr)2 / 1] &lt;&lt; 16 | (int)(&amp;value[startIndex])[(IntPtr)3 / 1] &lt;&lt; 24);
		}
		else
		{
			result = ((int)(*(&amp;value[startIndex])) &lt;&lt; 24 | (int)(&amp;value[startIndex])[(IntPtr)1 / 1] &lt;&lt; 16 | (int)(&amp;value[startIndex])[(IntPtr)2 / 1] &lt;&lt; 8 | (int)(&amp;value[startIndex])[(IntPtr)3 / 1]);
		}
	}
	return result;
}
</pre>
<p>Oh man, that’s a lot of code. What is it doing?</p>
<ul>
<li>A null check</li>
<li>A check to make sure that your starting index is inside the bounds and is non-negative</li>
<li>A check to make sure there are four bytes to read after the starting index</li>
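<li>If the starting index is a multiple of four, read the int directly through a pointer cast. Otherwise, combine the four bytes into the final result by bit shifting each one into place. The order in which to do that depends on the endianness of the system.</li>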
</ul>
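<p>The bit-shifting step in the last bullet can be sketched outside of .Net as well. The snippet below is a JavaScript illustration, not the framework&#8217;s implementation: <code>toInt32LittleEndian</code> and <code>toInt32BigEndian</code> are hypothetical names, and the input is assumed to be an array of byte values (0 to 255) with at least four entries after the starting index.</p>

```javascript
// Combine four bytes into a signed 32-bit integer, least significant byte first.
function toInt32LittleEndian(bytes, start) {
  return bytes[start] |
         (bytes[start + 1] << 8) |
         (bytes[start + 2] << 16) |
         (bytes[start + 3] << 24);
}

// The same four bytes, most significant byte first.
function toInt32BigEndian(bytes, start) {
  return (bytes[start] << 24) |
         (bytes[start + 1] << 16) |
         (bytes[start + 2] << 8) |
         bytes[start + 3];
}

var bytes = [0x78, 0x56, 0x34, 0x12];
console.log(toInt32LittleEndian(bytes, 0).toString(16)); // prints "12345678"
console.log(toInt32BigEndian(bytes, 0).toString(16));    // prints "78563412"
```

<p>The same four bytes produce two different integers, which is why the framework code has to branch on the endianness of the machine.</p>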
<p>If you make two key assumptions:</p>
<ul>
<li>Your arguments are valid</li>
<li>You will always be running on a little endian system</li>
</ul>
<p>You can speed this code up by 50%.</p>
<p>So why are these assumptions safe to make? All the error checking at the beginning of the function is good, I strongly believe in defensive coding, but sometimes it’s not needed.</p>
<ul>
<li>The null check: if you pass null with the null check you get an exception. If you pass null without a null check you still get an exception. It is just a NullReferenceException instead of an ArgumentNullException. In almost every circumstance I agree that getting the ArgumentNullException is better than the NullReferenceException, but we have already established that this is the .001% case where we care about performance more than almost anything else.</li>
<li>Check of the startingIndex against the length of the array: this is another check that simply provides a better error message instead of changing the behavior of the program. You can see that if the second check fails, the third check will fail as well (assuming a non-negative value for the starting index).</li>
<li>The last check is possibly necessary. If you have already checked, or can statically verify, that your arguments will always satisfy this condition then you can remove it. If you can’t do one of these two things then maybe it is needed.</li>
</ul>
<p>So now we’ve taken out the safety checks, and that gives us some benefit. The final way we can speed up this function is by making the assumption that the system is <a href="http://en.wikipedia.org/wiki/Endianness">little endian</a>. The .Net platform can run on lots of platforms but the vast majority of them are little endian. The three main exceptions are the <a href="http://blogs.msdn.com/b/robunoki/archive/2006/04/05/568737.aspx">Xbox</a>, <a href="http://blogs.msdn.com/b/robunoki/archive/2006/04/05/568737.aspx">some</a> Windows Phone 7 devices and possibly (I haven’t checked this but I think it is the case) Mono running on Linux on <a href="http://en.wikipedia.org/wiki/Endianness#Endianness_and_operating_systems_on_architectures">some hardware</a>. While I don’t have numbers on the percentage of .Net software that is developed for those three platforms, I feel pretty confident stating that the majority of code that is written for .Net will only ever run on little endian systems.</p>
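<p>If you would rather verify the little endian assumption than trust it, the byte order can be checked once at startup. In .Net the flag to look at is <code>BitConverter.IsLittleEndian</code>, which appears in the decompiled code above. The snippet below is a JavaScript sketch of the same idea under Node.js; <code>isLittleEndian</code> is a hypothetical helper name.</p>

```javascript
var os = require('os');

// Store the 32-bit value 1 and look at which byte ends up first in memory.
function isLittleEndian() {
  var probe = new Uint32Array([1]);
  return new Uint8Array(probe.buffer)[0] === 1;
}

console.log(isLittleEndian()); // true on little endian hardware
console.log(os.endianness());  // 'LE' or 'BE', as reported by Node.js
```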
<p>So the extremely optimized version of the ToInt32 function is:</p>
<pre class="brush: csharp; title: ; notranslate">
public unsafe static Int32 ToInt32(byte[] data, int startingIndex)
{
    fixed (byte* numRef = &amp;(data[startingIndex]))
    {
       return *(((int*)numRef));
    }
}
</pre>
<p>So here is the <a href="http://cerrio.wpengine.com/wp-content/uploads/2011/07/FastBitConverter.cs_.txt">fast BitConverter</a> file. As long as you are OK with the above two constraints, and you are in one of the rare cases where performance really does matter, this should help you out.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/fast-bitconverter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tools for Real-Time Data Analysts</title>
		<link>http://cerrio.com/tools-for-real-time-data-analysts/</link>
		<comments>http://cerrio.com/tools-for-real-time-data-analysts/#comments</comments>
		<pubDate>Wed, 27 Jul 2011 20:52:28 +0000</pubDate>
		<dc:creator>Ben Blair</dc:creator>
				<category><![CDATA[Real-time Systems]]></category>

		<guid isPermaLink="false">http://cerrio.wpengine.com/?p=406</guid>
		<description><![CDATA[Yesterday O&#8217;Reilly published an interview with Theo Schlossnagle on the state of real time data analysis. In it Theo mentions that a lot of the data analysis that is done now on real time data can be a bit shoddy: I personally see a lot of &#8220;analysis&#8221; happening that is less mature than your run-of-the-mill [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday O&#8217;Reilly published an interview with Theo Schlossnagle on the <a href="http://radar.oreilly.com/2011/07/realtime-data-business-decisions.html">state of real time data analysis</a>. In it Theo mentions that a lot of the data analysis that is done now on real time data can be a bit shoddy:</p>
<blockquote><p>I personally see a lot of &#8220;analysis&#8221; happening that is less mature than your run-of-the-mill graduate-level statistics course or even undergraduate-level signal analysis course.</p></blockquote>
<p>He suggests that</p>
<blockquote><p>&#8230;you need to have analysts&#8230; report to the business side &#8212; marketing, product, CFO, COO &#8212; instead of into the engineering side.</p></blockquote>
<p>I think these are both great points. Right now there are lots of good programmers doing shoddy work on data analysis. On the other hand, I worry that by moving the role of analyzing real time data away from the engineering side without doing something to compensate, you&#8217;ll end up with business analysts doing great mathematical work but shoddy programming work. We see this all the time in the trading world. With all due respect to the quants out there, there are an awful lot of fragile systems built from unreadable code by well-meaning quants.</p>
<p>I think there are two problems that have really yet to be solved.</p>
<p>The first need is for tools that allow the math PhDs to work with real-time data at an abstraction level they are comfortable with. The leap to big data analysis was relatively easy, because that was purely an engineering problem. The leap to meaningful real-time data analysis requires not just better implementations of numerical methods, but better ways of allowing the analysts (not engineers) to examine, change and experiment with real-time data flows <strong>while the data is flowing</strong>.</p>
<p>Once you give analysts the power to really experiment with dataflows the way they are used to experimenting on fixed data sets, they will be able to extract the real meaning hidden in those ever-changing data.</p>
<p>That&#8217;s not enough if you want the business side to be able to understand and act on those conclusions. I think the missing piece of the puzzle here is bringing the kind of beautiful visualizations that have had such an impact lately in the world of big data to the world of fast data. While there are a lot of challenges here, I think ultimately it&#8217;s a solvable problem. The human visual system is fundamentally temporal. I&#8217;m confident some of the amazing data-viz folks out there will find a way to make the output of real-time analysis intuitive and immediately understandable.</p>
<p>Right now however, that can be challenging. It&#8217;s tough to find a good platform that will let a semi-technical user do really powerful things. There are tools that will allow pretty much anyone to do simple things with real time streams. And of course you can have a team of programmers build you a custom solution (and then change it every time the analyst wants something different from the data).</p>
<p>At Cerrio, we&#8217;ve been building a <a href="/">real time platform</a> that will allow you to set up a system that can handle complex real time streams of data and process them in complex ways. It has the power of a custom built solution, but it can be built rapidly without a team of highly paid programmers (or consultants). And it can be easily edited by business analysts as their requirements change, without programmer intervention or complicated build/test/deploy cycles.</p>
<p>Try it out for free as part of our <a href="/private-beta-signup/">beta</a>. Or read about some of the things you can build with it in our <a href="/documentation/">documentation</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/tools-for-real-time-data-analysts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Usernames and E-Mail Addresses</title>
		<link>http://cerrio.com/usernames-and-e-mail-addresses/</link>
		<comments>http://cerrio.com/usernames-and-e-mail-addresses/#comments</comments>
		<pubDate>Fri, 22 Jul 2011 19:43:31 +0000</pubDate>
		<dc:creator>Brian Willard</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cerrio.com/?p=348</guid>
		<description><![CDATA[I was recently signing up for a new online service and I had to create an account. The form asked for all the usual things: name, username, E-mail address, and password. Pretty much every service you sign up wants all these things. The problem is some times your standard username is already taken. This is [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently signing up for a new online service and I had to create an account.  The form asked for all the usual things: name, username, E-mail address, and password. Pretty much every service you sign up for wants all these things. The problem is that sometimes your standard username is already taken.  This is annoying, because then you have to go through all the rigmarole of trying to find a username that is available.</p>
<p>Also just by having more fields you are <a href="http://ux.stackexchange.com/questions/6133/studies-showing-that-the-more-form-fields-there-are-the-less-conversions-you-will">probably lowering your conversion rate</a> and that costs you money.</p>
<p>Which brings me to my question, why do I need a username?  Some of the reasons that come to mind are:
<ul>
<li><strong>So you can log in</strong>, this is a valid argument if you are not collecting my e-mail address.  If you are collecting my e-mail address then please just let me use that.  Some would argue that there is some extra security in having a separate username because then a hacker has to guess both your username and password.  Your E-mail could <i>in theory</i> be more public than your username.  If people chose random usernames then I could almost buy this argument.  But from my experience most people choose usernames that are either identical to or very similar to their email address or name.  For instance my username is almost always bwillard or brian.willard, and I don&#8217;t think those usernames are adding a lot of security.
<li><strong>Serving as a unique identifier</strong>, every account needs to have a unique ID so it can be stored in the database.  But as a user I don&#8217;t care what ID you use for me, just generate a GUID or use your DB&#8217;s auto increment feature.  They are much more efficient at creating unique IDs than I am.
<li><strong>A combination of the previous two</strong>, you do need a value to log in with, and you do need* a value that doesn&#8217;t change to store things in the DB.  So a reasonable thought is that e-mail addresses can change so we don&#8217;t want to use that as your username.  But my argument is that your non-changing identifier and &#8216;username&#8217; to login with don&#8217;t have to be the same thing.  E-mail addresses are unique** so I don&#8217;t see any reason you can&#8217;t let people log in with their E-mail address and also let them change it.
<li><strong>Having a name to display on the site</strong>, instead just display my &#8216;real&#8217; name.  If the name you are displaying is for human consumption then a real name is probably a better choice than a cryptic username anyway.
<li><strong>It allows for a degree of anonymity</strong>, I understand wanting to be anonymous online.  When I give 10 stars to <a href="http://www.imdb.com/title/tt0368891/">National Treasure</a> on IMDB I don&#8217;t necessarily want that to be the first thing people see about me on Google.  But for cases where I care about my privacy I will just lie for my real name.  Unless the site is dealing with money they aren&#8217;t going to want to see a copy of your ID.
</ul>
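<p>The third point, keeping the non-changing identifier separate from the value people log in with, can be sketched in a few lines. This is a minimal in-memory illustration in JavaScript and not any particular site&#8217;s schema: <code>UserStore</code> and its methods are hypothetical names, a counter stands in for the database&#8217;s auto increment feature, and the addresses are placeholders.</p>

```javascript
// Hypothetical in-memory store: a generated ID is the stable key,
// while the e-mail address is the (changeable) login name.
function UserStore() {
  this.nextId = 1;            // stands in for the DB's auto increment
  this.usersById = new Map(); // id -> user record
  this.idByEmail = new Map(); // e-mail -> id, used at login
}

UserStore.prototype.register = function (email, displayName) {
  if (this.idByEmail.has(email)) {
    throw new Error('E-mail address already registered');
  }
  var id = this.nextId++;
  this.usersById.set(id, { id: id, email: email, displayName: displayName });
  this.idByEmail.set(email, id);
  return id;
};

UserStore.prototype.changeEmail = function (id, newEmail) {
  var user = this.usersById.get(id);
  if (!user) {
    throw new Error('Unknown user');
  }
  if (this.idByEmail.has(newEmail)) {
    throw new Error('E-mail address already registered');
  }
  this.idByEmail.delete(user.email); // the old address no longer logs in
  user.email = newEmail;
  this.idByEmail.set(newEmail, id);  // the ID itself never changes
};

UserStore.prototype.findByEmail = function (email) {
  var id = this.idByEmail.get(email);
  return id === undefined ? null : this.usersById.get(id);
};

var store = new UserStore();
var id = store.register('old@example.com', 'Example User');
store.changeEmail(id, 'new@example.com');
console.log(store.findByEmail('new@example.com').id === id); // prints true
console.log(store.findByEmail('old@example.com'));           // prints null
```

<p>Anything else stored about the user would reference the generated ID, so changing the e-mail address touches only the login lookup.</p>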
<p>So I might be the only one that gets annoyed at this.  And I only really notice it when I don&#8217;t get my primary username and I have to spend some time thinking of a new one.  But it also adds a field to your form and probably lowers your conversion rate.  It also increases the amount of information you need to store in your DB.</p>
<p>None of these are huge problems, but they are little problems, and I don&#8217;t really see any upside to having a username that is distinct from your E-mail address.  So let&#8217;s stop having them.</p>
<p>Am I missing something?  Did you have a good reason to have a username in your system? Let me know in the comments.</p>
<p><small><br />
*You probably don&#8217;t need to assign a non-changing value for each user, but in my experience at least having one is easier than not having one.</p>
<p>**I know some people share E-mail addresses. I don&#8217;t think this changes the argument: if they share e-mail accounts, which in my mind at least are the most private of all my online accounts, they are probably going to share an account on your service as well.</small></p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/usernames-and-e-mail-addresses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interesting Observation About Twitter Consistency</title>
		<link>http://cerrio.com/interesting-observation-about-twitter-consistancy/</link>
		<comments>http://cerrio.com/interesting-observation-about-twitter-consistancy/#comments</comments>
		<pubDate>Thu, 21 Jul 2011 23:27:37 +0000</pubDate>
		<dc:creator>Brian Willard</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cerrio.com/?p=336</guid>
		<description><![CDATA[Here at Cerrio we just changed our logo to better represent how we work in the cloud and provide ways to quickly build systems that handle real time data. One of the places we had to update our logo was on our Twitter profile. It was interesting because after we changed our logo it took [...]]]></description>
			<content:encoded><![CDATA[<p>Here at <a href="http://www.cerrio.com">Cerrio</a> we just changed our <a href="http://cerrio.wpengine.netdna-cdn.com/wp-content/themes/Cerrio/images/logo_final.png">logo</a> to better represent how we work in the cloud and provide ways to quickly build systems that handle real time data.</p>
<p>One of the places we had to update our logo was on our Twitter <a href="http://twitter.com/#!/CerrioSoftware">profile</a>. It was interesting because after we changed our logo it took hours (somewhere around three) for all of our Tweets to have the new logo on them.</p>
<p>Surprise number 1 was that I wouldn&#8217;t have thought that the logo would have been stored with the Tweet. I would have thought that would have been normalized out.</p>
<p>Surprise number 2 was that Tweets must be partitioned based on id. I would have thought that they might have been partitioned by user. It seems like there is a lot of data locality to take advantage of by storing tweets by user.</p>
<p>Overall I think this is a great example of a problem that is well suited to <a href="http://en.wikipedia.org/wiki/Eventual_consistency">eventual consistency</a>. There is no reason to tax the system just to get a logo rolled out everywhere all at once.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/interesting-observation-about-twitter-consistancy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Decoupling in Time through Republishing</title>
		<link>http://cerrio.com/decoupling-in-time-through-republishing/</link>
		<comments>http://cerrio.com/decoupling-in-time-through-republishing/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 12:54:09 +0000</pubDate>
		<dc:creator>cerrio</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cerrio.com/?p=228</guid>
		<description><![CDATA[Decoupling in time does not mean that eventually you and your partner will get tired of each other and split up, which is what my wife thought when I tried out the term on her. Sheesh, you forget to change the kitty litter for two measly weeks and some mean thoughts ooze to the surface. [...]]]></description>
			<content:encoded><![CDATA[<p>Decoupling in time does not mean that eventually you and your partner will get tired of each other and split up, which is what my wife thought when I tried out the term on her. Sheesh, you forget to change the kitty litter for two measly weeks and some mean thoughts ooze to the surface.</p>
<p>Decoupling in time is the idea that services in a real-time distributed system ought to be agnostic about (decoupled from) when other services are running. Like other dependencies, timing dependencies introduce <em>unnecessary</em> complexity, so they should be avoided.</p>
<p>The problem pertains to late-joining transformation services. In a streaming data environment, you might have chained together a dozen services, aggregating some data here, joining in some data there. Downstream data might be dependent on a large number of upstream data producers and converters. If a naïve “middle” service joins a live stream too late, then the data that already streamed by could be lost forever to the consumers down the line.</p>
<p>There needs to be a way for a data transformation service to get the current state of an upstream dataset before joining the live stream of changes, and then “make it right” on the downstream dataset the service produces.</p>
<p>The solution is something we call “republishing”, and it pops up frequently in event-driven data scenarios. Let’s say we’re trying to make a real-time aggregation service that is decoupled in time, for example aggregating trades into a net position. Our aggregator starts late, and its republishing process is:</p>
<p>1) Get all the trades.</p>
<p>2) Perform the aggregation and create a local, intermediate dataset of net position results.</p>
<p>3) Retrieve the current state of downstream net position records.</p>
<p>4) Determine the difference between the two datasets and then publish changes to the downstream net positions so they are equivalent to the intermediate dataset.</p>
<p>5) Continue to handle trade updates in real-time, now performing real-time aggregation.</p>
<p>The goal with republishing is to maintain downstream data integrity and at the same time provide warm redundancy. Network outages, hardware failure and software bugs happen. Services fail, and in a real-time data environment, you need to be able to bring them back up again quickly without losing the changes that occurred in the meantime. Republishing allows a service to fail, then be brought right back up again, recovering its data state and enabling the system to “knit itself back together”.</p>
<p>I’ve glossed over some important considerations related to republishing that are a bit outside the scope of this blog, but the primary one is data decoupling. In order to carry out republishing, a data transformation service needs to consume the state of both upstream and downstream datasets. This is much easier if the data itself is decoupled from the producer and consumer services (think databases). And for performance reasons, it’s nice if your decoupled data nodes support a dynamic, expressive subscription model.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/decoupling-in-time-through-republishing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tease me with the information. Set security aside.</title>
		<link>http://cerrio.com/tease-me-with-the-information-set-security-aside/</link>
		<comments>http://cerrio.com/tease-me-with-the-information-set-security-aside/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 12:53:43 +0000</pubDate>
		<dc:creator>cerrio</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cerrio.com/?p=225</guid>
		<description><![CDATA[For the last month I have been looking at PubSubHubbub, FourSquare, Factual, and Twitter. The barrier to entry is not very high for any of them, but figuring out the security is a small headache. Every site has its own security protocol. Most times I just want to see what the output of the [...]]]></description>
			<content:encoded><![CDATA[<p>For the last month I have been looking at PubSubHubbub, FourSquare, Factual, and Twitter. The barrier to entry is not very high for any of them, but figuring out the security is a small headache. Every site has its own security protocol. Most times I just want to see what the output of the data looks like and whether it would be useful at all. I will go through the security protocol for each.</p>
<p>PubSubHubbub is a web hook based pubsub protocol. To subscribe to a PuSH-enabled blog, you send an HTTP POST to the hub URL with certain parameters. The hub then verifies the subscription by sending a request to your callback URL. One key parameter of that request is hub.challenge, a string that you have to echo back in your response to confirm the subscription. FourSquare uses OAuth authentication. It lets an application access a user&#8217;s information without the user handing over credentials such as a username and password. Factual gives you a unique security code. You use this security code as part of your web request. Twitter allows you to use your account name and password to access their data.</p>
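<p>To make the PubSubHubbub step concrete, here is a minimal sketch of the decision a subscriber callback has to make. It assumes the hub verifies a subscription with a GET request whose query string carries hub.mode, hub.topic and hub.challenge. The function name answerVerification and its return shape are made up for illustration and are not part of any library.</p>

```javascript
// Decide how to answer a hub verification request.
// requestUrl is the URL (or just path and query) the hub requested;
// expectedTopic is the feed URL we actually asked to subscribe to.
function answerVerification(requestUrl, expectedTopic) {
  // The base URL only exists so that a path-only request URL can be parsed.
  var params = new URL(requestUrl, 'http://localhost').searchParams;
  var mode = params.get('hub.mode');
  var challenge = params.get('hub.challenge');
  var knownMode = mode === 'subscribe' || mode === 'unsubscribe';

  // Refuse anything we did not ask for, or a request with no challenge.
  if (!knownMode || challenge === null || params.get('hub.topic') !== expectedTopic) {
    return { status: 404, body: '' };
  }

  // Confirm by echoing the challenge string back as the response body.
  return { status: 200, body: challenge };
}
```

<p>An HTTP handler would write the returned status and body straight to the response.</p>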
<p>Accessing data through Factual and Twitter is very simple. As a developer the main thing I care about is the data. I want to spend the smallest amount of time possible figuring out how to get in the door and retrieve the information. Yes, I understand security is important. From the company&#8217;s perspective you don&#8217;t want people to corrupt or misuse your data. You also want an easy way to deny a user or application. From a developer&#8217;s perspective I go through these security measures to make sure that there is no man in the middle giving me fake data. What about a compromise?</p>
<p>If seeing the data is useful to me, why not just give it to me? Give me access to fake data. Have me set up a username and password. Make it easy. Let me retrieve the data using a curl command, so that I do not have to write code to try your API. Tease me with the data. I can see the data format. I can see if it is relevant for my project. From there I can decide if I want to commit to you. Twitter does this perfectly. In real time I can see data go across my computer terminal.</p>
<p>As one of these companies, you may ask yourself what you get in return. For one, you will have more developers. You will at least have more people who will try out your system. Why make a developer spend time on something they don&#8217;t care about? I for one will invest time to figure things out if I know the data is relevant to me. If I am not sure and I cannot get in the door, I will move on. There are many other sites similar to yours. Time is important to me. Time is important to other developers as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/tease-me-with-the-information-set-security-aside/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Road-runner typing (it&#8217;s like duck typing but faster)</title>
		<link>http://cerrio.com/road-runner-typing-its-like-duck-typing-but-faster/</link>
		<comments>http://cerrio.com/road-runner-typing-its-like-duck-typing-but-faster/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 12:51:57 +0000</pubDate>
		<dc:creator>cerrio</dc:creator>
				<category><![CDATA[C#]]></category>

		<guid isPermaLink="false">http://www.cerrio.com/?p=218</guid>
		<description><![CDATA[Try this experiment for size if you have some free time, appropriate building access, and a few wacky bones in your body. Walk into a Python development shop and loudly proclaim &#8220;People that program in dynamically typed languages do so because they couldn&#8217;t create an interface to save their poor, miserable lives!&#8221; My prediction&#8211;you will [...]]]></description>
			<content:encoded><![CDATA[<p>Try this experiment for size if you have some free time, appropriate building access, and a few wacky bones in your body. Walk into a Python development shop and loudly proclaim &#8220;People that program in dynamically typed languages do so because they couldn&#8217;t create an interface to save their poor, miserable lives!&#8221; My prediction&#8211;you will be immediately awarded a job at the salary of your choosing. Perhaps not. A more likely outcome is a bunch of (physical) objects flying in the direction of your face. Most of them probably missing it because, as everyone knows, Python programmers throw like girls. No, no, I&#8217;m not really a Python hater, nor do I look with disdain on people that program in dynamically typed languages. I&#8217;m merely trying to demonstrate that most programmers fall squarely into either the dynamically typed camp or the statically typed camp. Some swing both ways but those people are just weird. Well, most of my programming life I&#8217;ve spent in the rigid, statically typed camp and, call me conservative, but I like that extra reassurance that the compiler will prevent me from sneaking in the design inconsistencies I didn&#8217;t realize I had made. It&#8217;s a false comfort you say? Nonsense! Comfort is either experienced or it&#8217;s not. But if it is experienced then it cannot be false. Still, sometimes, just sometimes, I really wish my programs could quack, swim, and walk like a duck. And that&#8217;s precisely the topic I&#8217;d like to discuss henceforth in this post. I know I took my sweet, little time to tell you this but since you&#8217;re still reading then you were probably marginally entertained. Or wondering whether you&#8217;re this weird when you&#8217;re over-caffeinated.</p>
<p>Cheesy intros aside, once in a while a programmer of the statically typed variety runs into a situation when he or she wonders whether their life choices have led them in the right direction and whether being dynamically typed might have proven to be an easier lifestyle. One such situation occurred here when we were trying to figure out how to efficiently translate between the serialized messages &#8220;on the wire&#8221; and our program constructs.</p>
<p>The problem was that our clients and servers were exchanging variable-length messages without having type information at compile time. (Technically, the clients do have it but the servers don&#8217;t, which means it&#8217;s still a problem that needs solving!) To make matters worse, a client&#8217;s view of the data might be out of alignment with the server&#8217;s. Yeah, that was a confusing way to put things just now but we&#8217;ll just work through this together. Let&#8217;s say a server was sending a Person record { BrainChipId, FirstName, LastName, AgeInMilliseconds } to a client. It would encode it using some (in our case über efficient!) binary encoding. Great! Meanwhile, the C# client has a type called Peon { BrainChipId, AgeInMilliseconds, MothersMaidenName }. Clearly they have some fields in common but they also have some out common. Yeah, I know it&#8217;s not a real phrase. English is so asymmetricalacirtemmysa! So the client and the server agree that it&#8217;s good enough since there&#8217;s an intersection of fields and wish to continue talking. This would be a perfect fit for duck typing but for its absence from C#. So some improvisation is called for. We can resort to Reflection but in performance-critical situations it would be like taking the scenic route to your destination. We can also represent each record as a dictionary keyed on field names. This would be conceptually close to some dynamically typed languages but each set/get would be a hash table access and, although hash table lookups are O(1), we would much prefer something that&#8217;s o(1). This is where .NET&#8217;s facilities for generating dynamic types and methods come in very handy (see <a title="System.Reflection.Emit" href="http://msdn.microsoft.com/en-us/library/system.reflection.emit.aspx">System.Reflection.Emit</a>). 
Our messaging layer was fitted with logic to emit getters and setters for marshaling data between the serialized, &#8220;on the wire&#8221; representation and the business logic type instances. If this sounds extremely abstract, it&#8217;s probably because it is. In the context of our meager example, the client creates an instance of the Peon type, sets its values and invokes an API call on the messaging layer (doesn&#8217;t that sound impressive?) requesting a new Peon to be added to the multi-million strong list of drones. The messaging layer invokes the make-it-happen method and the magic happens. Wouldn&#8217;t that be nice? In reality, though, once the client connects to the server, the server sends it record layouts of all of its data types so the messaging layer has something to work with. So there is enough information to construct a proper message to the server from the data contained in the submitted Peon instance and it uses the dynamically-constructed getters to extract it.</p>
<p>Pretty much the reverse happens when the client receives a message from the server. It creates a new Peon instance and fills it with the data from the message by invoking the setters it had pre-emitted. Then it invokes a callback to let the business logic know that it &#8220;got mail&#8221;. Beautiful and efficient. We&#8217;ve concocted a cozy illusion of duck typing&#8217;ness in our statically typed C# world.</p>
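<p>The shape of the idea can be sketched outside C# as well. The JavaScript below is only an analogy under stated assumptions: JavaScript has no System.Reflection.Emit, so closures stand in for the emitted getters and setters, the wire message is modelled as a plain array in layout order instead of a binary encoding, and the names compileMarshaler, personLayout and peonFields are invented for this example. What it does show is that the matching of field names happens once, when the record layout arrives, and not once per message.</p>

```javascript
// Hypothetical record layout the server sends on connect: names in wire order.
var personLayout = ['BrainChipId', 'FirstName', 'LastName', 'AgeInMilliseconds'];
// Hypothetical client-side type, described by its field names.
var peonFields = ['BrainChipId', 'AgeInMilliseconds', 'MothersMaidenName'];

// Runs once per (layout, type) pair: resolve the shared fields to wire positions.
function compileMarshaler(layout, fields) {
  var shared = fields
    .filter(function (name) { return layout.indexOf(name) !== -1; })
    .map(function (name) { return { name: name, index: layout.indexOf(name) }; });

  return {
    // Wire message (array in layout order) to a client object.
    // Fields the client type does not declare are ignored.
    decode: function (message) {
      var record = {};
      shared.forEach(function (field) { record[field.name] = message[field.index]; });
      return record;
    },
    // Client object to a wire message.
    // Fields the client type lacks are sent as null.
    encode: function (record) {
      var message = layout.map(function () { return null; });
      shared.forEach(function (field) { message[field.index] = record[field.name]; });
      return message;
    }
  };
}

// Built once after connecting, then reused for every message.
var marshaler = compileMarshaler(personLayout, peonFields);
```

<p>Here marshaler.decode turns a four-field Person message into an object holding only BrainChipId and AgeInMilliseconds, and marshaler.encode goes the other way, leaving the two name fields empty.</p>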
<p>If it types like a duck but goes about its business statically then it must be road-runner typing!</p>
]]></content:encoded>
			<wfw:commentRss>http://cerrio.com/road-runner-typing-its-like-duck-typing-but-faster/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

