<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Matthew Newhook</title>
	<atom:link href="http://zeroc.com/blogs/matthew/feed/" rel="self" type="application/rss+xml" />
	<link>http://zeroc.com/blogs/matthew</link>
	<description>Matthew Newhook weblog</description>
	<pubDate>Thu, 11 Feb 2010 23:02:13 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Ice in 20 minutes</title>
		<link>http://zeroc.com/blogs/matthew/2009/04/03/ice-in-20-minutes/</link>
		<comments>http://zeroc.com/blogs/matthew/2009/04/03/ice-in-20-minutes/#comments</comments>
		<pubDate>Fri, 03 Apr 2009 15:47:20 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=13</guid>
		<description><![CDATA[It&#8217;s been a while. Since I last posted, we&#8217;ve been busy! We&#8217;ve released Ice for the iPhone, Ice for Android, an Eclipse plug-in, and last but not least, Ice 3.3.1. Of course, the project that we&#8217;re still most excited about is Ice itself. To help get you, the reader, as excited as I am, I&#8217;ve [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a while. Since I last posted, we&#8217;ve been busy! We&#8217;ve released Ice for the iPhone, Ice for Android, an Eclipse plug-in, and last but not least, Ice 3.3.1. Of course, the project that we&#8217;re still most excited about is Ice itself. To help get you, the reader, as excited as I am, I&#8217;ve spent a bit of time on a whirlwind tour de force screencast introducing Ice. Ice in 20 minutes, from installing it on your machine, to having a fully working Java and Python application.</p>
<p>See <a href="http://www.zeroc.com/doc/screencasts.html">http://www.zeroc.com/doc/screencasts.html</a> for the screencast itself. I hope you enjoy it, and look for more in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2009/04/03/ice-in-20-minutes/feed/</wfw:commentRss>
		</item>
		<item>
		<title>How many clients can IceStorm support?</title>
		<link>http://zeroc.com/blogs/matthew/2008/08/20/how-many-clients-can-icestorm-support/</link>
		<comments>http://zeroc.com/blogs/matthew/2008/08/20/how-many-clients-can-icestorm-support/#comments</comments>
		<pubDate>Wed, 20 Aug 2008 15:08:57 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[Performance]]></category>

		<category><![CDATA[Ice]]></category>

		<category><![CDATA[IceStorm]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=12</guid>
		<description><![CDATA[… or how long is a piece of string?
From an IceStorm perspective, there are two types of clients:

Publishers: Clients that publish messages to topics.
Subscribers: Servers that receive messages from a topic.

The exact number of supported publishers and subscribers depends on the total load imposed by them. Load is composed of several factors.
The first is message [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>… or how long is a piece of string?</p></blockquote>
<p>From an IceStorm perspective, there are two types of clients:</p>
<ul>
<li>Publishers: Clients that publish messages to topics.</li>
<li>Subscribers: Servers that receive messages from a topic.</li>
</ul>
<p>The exact number of supported publishers and subscribers depends on the total load imposed by them. Load is composed of several factors.</p>
<p>The first is message throughput, expressed as events per second (EPS). Message throughput is a function of latency and the total number of raw messages sent. By way of example:</p>
<ul>
<li>You have 1 publisher publishing 1 EPS to 1 subscriber: throughput is 1 EPS.</li>
<li>You have 1 publisher publishing 10 EPS to 1 subscriber: throughput is 10 EPS.</li>
<li>You have 1 publisher publishing 10 EPS to 10 subscribers: throughput is 100 EPS.</li>
<li>You have 10 publishers publishing 10 EPS to 10 subscribers: throughput is 1000 EPS.</li>
</ul>
<p>The second factor that influences load is message size. By multiplying EPS with message size, you can roughly determine the amount of bandwidth IceStorm will consume.</p>
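<p>As a rough sketch of this arithmetic, the two estimates can be written as small C++ helpers. The function names are illustrative only and are not part of Ice or IceStorm, and the bandwidth figure ignores protocol overhead:</p>
<pre>// Illustrative helpers only; these names are not part of Ice or IceStorm.

// Events delivered per second: every event from every publisher is
// delivered once to each subscriber.
long long deliveredEps(long long publishers, long long epsPerPublisher, long long subscribers)
{
    return publishers * epsPerPublisher * subscribers;
}

// Rough outbound bandwidth in bytes per second (protocol overhead ignored).
long long bandwidthBytesPerSecond(long long eps, long long messageSizeBytes)
{
    return eps * messageSizeBytes;
}</pre>
<p>For example, <span style="font-size: 10pt;font-family: Consolas">deliveredEps(10, 10, 10)</span> gives 1000 EPS, which matches the last case in the list, and at 1,024 bytes per event that is about 1,000 KB per second leaving IceStorm.</p>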
<p>Bigger events also mean that IceStorm&#8217;s memory consumption will increase. However, memory use is not primarily a function of the number of subscribers, but of the number of publishers and the number of events that are queued for subscribers. A single event published to one subscriber consumes only marginally less memory than a single event published to one hundred subscribers. This is because the memory allocated to an event is shared among all subscribers. Each subscriber has a separate event queue that points at the event, so queuing an event for an additional subscriber only consumes memory to store a pointer to the event in the queue, not memory to store an extra copy of the event.</p>
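<p>The sharing described here can be illustrated with a small, self-contained C++ sketch. This is not IceStorm&#8217;s actual implementation: the types <span style="font-size: 10pt;font-family: Consolas">Event</span> and <span style="font-size: 10pt;font-family: Consolas">Subscriber</span> and the functions below are hypothetical, and the sketch assumes C++11 smart pointers:</p>
<pre>#include &lt;deque&gt;
#include &lt;memory&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical types for illustration; not IceStorm's internal classes.
struct Event
{
    std::string payload;
};
typedef std::shared_ptr&lt;const Event&gt; EventPtr;

struct Subscriber
{
    std::deque&lt;EventPtr&gt; queue; // holds pointers, not copies of the event
};

// Allocates the event payload exactly once.
EventPtr makeEvent(const std::string&amp; payload)
{
    return std::make_shared&lt;Event&gt;(Event{payload});
}

// Queues the same event for every subscriber; each queue entry costs
// one pointer, regardless of the size of the payload.
void publish(const EventPtr&amp; event, std::vector&lt;Subscriber&gt;&amp; subscribers)
{
    for(auto&amp; s : subscribers)
    {
        s.queue.push_back(event);
    }
}</pre>
<p>With one hundred subscribers, every queue refers to the same <span style="font-size: 10pt;font-family: Consolas">Event</span> object, so the payload is stored once however many subscribers are waiting for it.</p>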
<p>IceStorm provides four quality-of-service modes for event delivery: oneway, twoway, twoway-ordered, and batched. Of these, oneway is fastest, followed by twoway, and then twoway-ordered. In particular, oneway events (if you can live with the limitations imposed by them) provide a big increase in EPS. For high-latency networks, you can also set batch delivery, which increases overall throughput by combining several events into a single physical protocol message. The cost is that events arrive at subscribers in bursts, and the bursts are separated by longer gaps than for unbatched delivery. For low-latency networks, batching does not provide any benefits. (See <a href="http://www.zeroc.com/Ice-Manual.html">Ice Manual</a> for more information on batch delivery.)</p>
<p>The best way to determine IceStorm&#8217;s maximum raw throughput is to run a latency test for the delivery mode you intend to use. For example, if you intend to use twoway delivery, run the C++ version of the twoway latency demo on your machine (see <span style="font-size: 10pt;font-family: Consolas">cpp/demo/Ice/latency</span>). This demo measures the latency of sending twoway messages on a single machine. On one of my dual-core machines, the latency demo measures 0.149ms per message, which works out to roughly 6,700 twoway messages per second. If you have more than two cores, you will achieve higher throughput (but throughput will not increase linearly with the number of cores).</p>
<p>To get a feel for the maximum bandwidth, you can run the C++ throughput demo (<span style="font-size: 10pt;font-family: Consolas">cpp/demo/Ice/throughput</span>). Of course, this demo does not take into account the physical bandwidth limitations imposed by your network because client and server communicate over the loopback interface. However, you can also run the throughput client and server on separate machines to see how your network affects bandwidth.</p>
<p>To get optimum throughput from IceStorm for your application, you will need to play with the IceStorm configuration settings. The main setting you will be concerned with is the number of threads in the publishing thread pool. This setting determines how many events can concurrently be received and published. (Note that, if you are publishing messages to IceStorm with a oneway proxy and also require strict event ordering, you cannot configure more than one thread in the IceStorm thread pool.)</p>
<p>A good starting point for the thread pool size is the number of cores on your IceStorm host. For example, assuming the name of the service is IceStorm, and IceStorm runs on a four-core machine, define the following:</p>
<p><span style="font-size: 10pt;font-family: Consolas">IceStorm.Publish.ThreadPool.Size=4<br />
</span></p>
<p>As you can see, the question of how many clients IceStorm can support cannot be answered in isolation. To obtain a realistic estimate, you have to understand the application structure, the load imposed on IceStorm, the network and bandwidth restrictions, as well as the hardware on which IceStorm executes. Running benchmarks in an environment that resembles the deployment configuration as closely as possible is the best way to determine the total possible load that can be supported. The general guidelines I&#8217;ve outlined above should point you in the right direction.</p>
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2008/08/20/how-many-clients-can-icestorm-support/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Glacier2 Scalability Improvements in Ice 3.3</title>
		<link>http://zeroc.com/blogs/matthew/2008/07/31/glacier2-scalability-improvements-in-ice-33/</link>
		<comments>http://zeroc.com/blogs/matthew/2008/07/31/glacier2-scalability-improvements-in-ice-33/#comments</comments>
		<pubDate>Thu, 31 Jul 2008 18:23:58 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[scalability]]></category>

		<category><![CDATA[asynchronous]]></category>

		<category><![CDATA[glacier2]]></category>

		<category><![CDATA[Ice]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=10</guid>
		<description><![CDATA[Glacier2, the Ice firewall traversal service, has two modes by which it forwards requests and replies between clients and servers—buffered and unbuffered. In unbuffered mode, the thread that receives a request or reply forwards it immediately. In buffered mode, requests and replies are queued and delivered by a separate thread.  With Ice 3.3, buffered [...]]]></description>
			<content:encoded><![CDATA[<p>Glacier2, the Ice firewall traversal service, has two modes by which it forwards requests and replies between clients and servers—buffered and unbuffered. In unbuffered mode, the thread that receives a request or reply forwards it immediately. In buffered mode, requests and replies are queued and delivered by a separate thread.  With Ice 3.3, buffered mode is required only if an application needs <a href="http://www.zeroc.com/doc/Ice-3.3.0/manual/Glacier2.40.9.html">request batching</a> and <a href="http://www.zeroc.com/doc/Ice-3.3.0/manual/Glacier2.40.9.html">request overriding</a>. In contrast, with all prior versions of Glacier2, buffered mode was essential.</p>
<p>Before Ice 3.3, the purpose of Glacier2&#8217;s buffered mode was primarily to isolate clients from each other. Because any request could block (asynchronous or otherwise), in buffered mode, Glacier2 spawned two threads per connected client. One thread was used to forward requests from the connected client to the back-end, and the other was used to forward replies from the back-end to the client. In unbuffered mode, any misbehaving application (whether malicious or just faulty) could block an invocation, causing denial of service to all other connected clients.</p>
<p>Because buffered mode was the only way to provide isolation, it was used by virtually all front-facing applications. However, real-world use showed that once several hundred clients were connected, Glacier2 suffered scalability issues from context switching between threads as well as the limited virtual memory space in 32-bit operating systems.</p>
<p>With the introduction of true non-blocking asynchronous invocations in Ice 3.3, we re-architected Glacier2 to take advantage of this feature. Given that this eliminates all the buffering threads, Glacier2 should theoretically realize a big improvement in scalability. I set about testing this during the Ice 3.3 development process.</p>
<p>To test scalability, I wanted to establish many Glacier2 sessions. I modified the subscriber from the <span style="font-family: Courier New">demo/IceStorm/clock</span> example to establish multiple sessions with the Glacier2 router, and I modified the publisher to publish a single event per second. (The publisher did not use Glacier2 but connected directly to IceStorm.) The published events were text strings that contained the current time.</p>
<p>Next I set up the following environment:</p>
<ul>
<li>Glacier2 hosted on a single-core 3.2GHz CentOS 4 machine with 2GB of memory.</li>
<li>IceStorm hosted on a CentOS 4 virtual machine, with the VM using one core of a Q6600 machine with 4GB of memory.</li>
<li>Publisher and subscriber hosted on a MacBook.</li>
<li>All machines were connected via a gigabit network.</li>
<li>All machines used Ice 3.3, compiled with optimization.</li>
</ul>
<p>After starting Glacier2 and IceStorm, I ran the subscriber application and waited for all subscriptions to complete. Initially, I configured the subscriber to establish 1,000 sessions with the Glacier2 router. After starting the publisher, I could see the events arrive at the subscriber. With 1,000 events per second flowing through the Glacier2 router, the system had no problems keeping up.</p>
<p>I gradually pumped up the volume by adding more and more subscribers until, eventually, I topped out at 8,000 subscribers, which corresponds to 8,000 events per second flowing through Glacier2. At that point, the Glacier2 host reached 100% CPU usage. Adding more subscribers confirmed that Glacier2 could no longer keep up with the flood of events; messages were gradually accumulating in the router, and subscribers were no longer receiving timely updates. On the other hand, with 8,000 subscribers, the IceStorm host used only 5% of the CPU, and the machine running the subscribers also showed very low CPU utilization.</p>
<p>Considering that the Glacier2 host (3.2GHz single core) is a slow and quite underpowered machine by modern-day standards, I think my testing shows that Glacier2&#8217;s current scalability is very good indeed. This is especially true given that Glacier2 is really only used for WAN-side applications. Although 8,000 events per second may not sound all that fast, consider the picture from a bandwidth point of view: even if these events are very modestly sized at 1KB each, a T3 connection (at 45 Mb/s, or 5,760 KB/s) would be fully saturated with 5,760 events per second. In the end, I think you&#8217;ll find that Glacier2 will not be the bottleneck in your project.</p>
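<p>The bandwidth comparison can be checked with a one-line calculation. The helper below is only a sketch (its name is made up and it is not part of Ice or Glacier2); it takes a T3 link as roughly 45 megabits per second, uses binary prefixes to match the figures above, and ignores protocol overhead:</p>
<pre>// Illustrative helper; not part of Ice or Glacier2.
// Returns how many events of the given size fit through a link each
// second, ignoring protocol overhead.
long long eventsPerSecondToSaturate(long long linkBitsPerSecond, long long eventSizeBytes)
{
    return (linkBitsPerSecond / 8) / eventSizeBytes;
}</pre>
<p>Calling <span style="font-family: Courier New">eventsPerSecondToSaturate(45LL * 1024 * 1024, 1024)</span> returns 5760, well below the 8,000 events per second that the single-core Glacier2 host sustained.</p>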
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2008/07/31/glacier2-scalability-improvements-in-ice-33/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Integrating Ice with a GUI revisited</title>
		<link>http://zeroc.com/blogs/matthew/2008/07/28/integrating-ice-with-a-gui-revisted/</link>
		<comments>http://zeroc.com/blogs/matthew/2008/07/28/integrating-ice-with-a-gui-revisted/#comments</comments>
		<pubDate>Mon, 28 Jul 2008 16:57:21 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[Ice]]></category>

		<category><![CDATA[gui ice asynchronous non-blocking]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=8</guid>
		<description><![CDATA[I wrote a series of Connections articles in 2006 that explored issues with using Ice in a graphical application. The first three articles concentrated on the GUI event loop, and specifically on strategies for issuing remote invocations without adversely affecting the user experience. The central problem was that all remote invocations had the potential to [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote a series of <a href="http://www.zeroc.com/newsletter"><em>Connections</em></a> articles in 2006 that explored issues with using Ice in a graphical application. The first three articles concentrated on the GUI event loop, and specifically on strategies for issuing remote invocations without adversely affecting the user experience. The central problem was that all remote invocations had the potential to block, even if they were oneway or asynchronous.</p>
<p>In my first article, which appeared in <a href="http://www.zeroc.com/newsletter/issue12.pdf">Issue 12</a>, I introduced a call queue to avoid blocking the GUI thread. The idea was to write a class for each remote operation that the application needed to invoke; adding an instance of this class to the queue scheduled an invocation by a separate thread. The following code demonstrates the API:</p>
<pre>class Call : public Shared
{
public:
    virtual void execute() = 0;
};
typedef Handle&lt;Call&gt; CallPtr;

class SayHelloCall : public Call
{
public:
    SayHelloCall(const HelloPrx&amp; hello) :
        _hello(hello)
    {
    }

    void execute()
    {
        _hello-&gt;sayHello();
    }

private:
    const HelloPrx _hello;
};

CallQueuePtr queue = ...;
HelloPrx hello = ...;
queue-&gt;call(new SayHelloCall(hello));</pre>
<p>The call queue implementation was relatively simple: a dedicated thread executed the queued calls in order, essentially serializing all of the invocations. A side effect of this implementation was that it guaranteed strict ordering of twoway requests in the same server.</p>
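<p>A dedicated-thread call queue of this kind might look roughly like the sketch below. This is not the code from Issue 12: the class name <span style="font-size: 10pt;font-family: Courier New">SimpleCallQueue</span> is hypothetical, and the sketch assumes C++11 threads and <span style="font-size: 10pt;font-family: Courier New">std::function</span> in place of the Ice <span style="font-size: 10pt;font-family: Courier New">Shared</span> and <span style="font-size: 10pt;font-family: Courier New">Handle</span> types so that it stands on its own:</p>
<pre>#include &lt;condition_variable&gt;
#include &lt;deque&gt;
#include &lt;functional&gt;
#include &lt;mutex&gt;
#include &lt;thread&gt;
#include &lt;utility&gt;

// Hypothetical sketch of a call queue; not the implementation from Issue 12.
class SimpleCallQueue
{
public:
    SimpleCallQueue() : _done(false), _thread(&amp;SimpleCallQueue::run, this)
    {
    }

    // Lets the worker finish any calls that are still queued, then joins it.
    ~SimpleCallQueue()
    {
        {
            std::lock_guard&lt;std::mutex&gt; lock(_mutex);
            _done = true;
        }
        _cond.notify_one();
        _thread.join();
    }

    // Called from the GUI thread; returns without running the call.
    void call(std::function&lt;void()&gt; c)
    {
        {
            std::lock_guard&lt;std::mutex&gt; lock(_mutex);
            _calls.push_back(std::move(c));
        }
        _cond.notify_one();
    }

private:
    // Worker thread: executes queued calls one at a time, in FIFO order.
    void run()
    {
        std::unique_lock&lt;std::mutex&gt; lock(_mutex);
        for(;;)
        {
            _cond.wait(lock, [this] { return _done || !_calls.empty(); });
            if(_calls.empty())
            {
                return; // Shutting down and nothing left to execute.
            }
            std::function&lt;void()&gt; c = std::move(_calls.front());
            _calls.pop_front();
            lock.unlock(); // Do not hold the lock during the (possibly blocking) call.
            c();
            lock.lock();
        }
    }

    std::mutex _mutex;
    std::condition_variable _cond;
    std::deque&lt;std::function&lt;void()&gt; &gt; _calls;
    bool _done;
    std::thread _thread; // Declared last so the other members exist before it starts.
};</pre>
<p>Because a single thread drains the queue, calls run in the order in which they were queued, which is where the ordering guarantee comes from. A blocking call delays only the calls queued behind it, not the GUI thread.</p>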
<p>I enhanced the call queue in <a href="http://www.zeroc.com/newsletter/issue13.pdf">Issue 13</a> to support multiple concurrent requests using two different implementations: one using asynchronous invocations and the other using a thread pool. Both strategies sacrificed the strict ordering guarantee in order to improve throughput. I also introduced the idea of using multiple call queues for messages with different qualities of service (such as oneway, twoway, and batched messages).</p>
<p>I presented an advanced technique in <a href="http://www.zeroc.com/newsletter/issue14.pdf">Issue 14</a> that used the Ice router facility to eliminate the need to define a <span style="font-size: 10pt;font-family: Courier New">Call</span> class for each remote operation. With this technique, invocations on Ice objects from the GUI are accomplished using the standard asynchronous API. The resulting code is considerably more straightforward:</p>
<pre>class AMI_Hello_sayHelloI : public AMI_Hello_sayHello
{
public:
    void ice_response()
    {
    }

    void ice_exception(const Ice::LocalException&amp;)
    {
    }
};

HelloPrx twoway = // Some twoway proxy.
twoway-&gt;sayHello_async(new AMI_Hello_sayHelloI());</pre>
<p>The disadvantage of this technique was that, along with the complexity of having to implement an Ice router, oneway invocations were not handled gracefully. Instead, they required the use of a special request context key, similar to the one used by the Glacier2 router.</p>
<p>We were not very happy with this situation, and we went about fixing it in the Ice 3.3 release. To ensure that asynchronous calls never block, Ice 3.3 introduced some major changes to the internal architecture, which Benoit and Mark detailed in their article &#8220;Background I/O&#8221; in <a href="http://www.zeroc.com/newsletter/issue28.pdf">Issue 28</a> of <a href="http://www.zeroc.com/newsletter"><em>Connections</em></a>. The Ice run time now maintains its own queue of outstanding calls. Unlike my call queue implementations, Ice&#8217;s queue is hidden from the application and much more efficient because calls only need be queued if they cannot be sent immediately.</p>
<p>Contrary to what you might expect, however, synchronous oneway invocations such as the one shown below may still block:</p>
<pre>HelloPrx twoway = // Some twoway proxy.
HelloPrx oneway = HelloPrx::uncheckedCast(twoway-&gt;ice_oneway());
oneway-&gt;sayHello();</pre>
<p>The call to <span style="font-size: 10pt;font-family: Courier New">sayHello</span> can block the calling thread during connection establishment, while performing DNS lookups, or because the server is slow or non-responsive.</p>
<p>Why did we decide to preserve the existing semantics of synchronous oneways? It certainly would be possible to make oneway invocations non-blocking, but that would introduce a big problem: how would we report errors back to the application? What should Ice do if, for example, a DNS lookup failed? Since the call to <span style="font-size: 10pt;font-family: Courier New">sayHello</span> would not be allowed to block, the DNS lookup would have to occur in a separate thread, so the call to <code>sayHello</code> could return. But once the call to <span style="font-size: 10pt;font-family: Courier New">sayHello</span> has returned, there is no obvious way to report the error to the application.</p>
<p>Ice 3.3 solved this problem in a different way by adding support for asynchronous oneways. These work much like asynchronous twoways in that you have to define an AMI callback object. The primary difference is that <span style="font-size: 10pt;font-family: Courier New">ice_response</span> is never called because oneway invocations do not have responses. If an error occurs while Ice attempts to send the message, the Ice run time invokes <span style="font-size: 10pt;font-family: Courier New">ice_exception</span> on the callback as usual. The code below shows how to make an asynchronous oneway request:</p>
<pre>class AMI_Hello_sayHelloI : public AMI_Hello_sayHello
{
public:
    void ice_response()
    {
        assert(false); // Never called: oneway invocations have no response.
    }

    void ice_exception(const Ice::LocalException&amp;)
    {
    }
};

HelloPrx twoway = // Some twoway proxy.
HelloPrx oneway = HelloPrx::uncheckedCast(twoway-&gt;ice_oneway());
oneway-&gt;sayHello_async(new AMI_Hello_sayHelloI());</pre>
<p>One of the first applications to take advantage of the new non-blocking oneway semantics was IceStorm. Much effort went into prior versions of IceStorm to ensure that invocations from publishers would never block, a non-trivial task given that there was no simple way to make non-blocking invocations. With the new capabilities in Ice 3.3, I immediately re-architected and simplified the IceStorm internals to take advantage of the new non-blocking semantics. However, once I started performance and stress testing, I noticed that there was a new issue: there was no flow control on the non-blocking invocations. The upshot of this was that flooding IceStorm with events caused the memory usage to climb very quickly. To solve this problem we added a callback that Ice invokes after a queued request is actually sent. This can be used to flow-control oneway events.</p>
<p>To receive this notification, an AMI callback object must implement the <span style="font-size: 10pt;font-family: Courier New">Ice::AMISentCallback</span> interface. Calling the asynchronous oneway method returns <span style="font-size: 10pt;font-family: Courier New">false</span> if the message was queued, or <span style="font-size: 10pt;font-family: Courier New">true</span> if the message was sent immediately. When the queued message is eventually sent, the Ice run time invokes the <span style="font-size: 10pt;font-family: Courier New">AMISentCallback::ice_sent</span> method on the callback object:</p>
<pre>class AMI_Hello_sayHelloI : public AMI_Hello_sayHello, public Ice::AMISentCallback
{
public:
    void ice_sent()
    {
        // Called once the queued message has been sent.
    }

    // ice_response, ice_exception implementation here
};

HelloPrx twoway = // Some twoway proxy.
HelloPrx oneway = HelloPrx::uncheckedCast(twoway-&gt;ice_oneway());

if(!oneway-&gt;sayHello_async(new AMI_Hello_sayHelloI()))
{
    // The request was queued, and ice_sent will be called
    // once the message is sent.
}</pre>
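<p>One simple way to turn this notification into flow control is to cap the number of requests that have been queued but not yet sent. The sketch below is an assumption about how that could be done, not a description of IceStorm&#8217;s internals; the class <span style="font-size: 10pt;font-family: Courier New">SendWindow</span> and its methods are hypothetical, and it assumes C++11:</p>
<pre>#include &lt;cstddef&gt;
#include &lt;mutex&gt;

// Hypothetical helper, not part of Ice: limits how many requests may be
// outstanding (reserved but not yet reported as sent) at any time.
class SendWindow
{
public:
    explicit SendWindow(std::size_t limit) : _limit(limit), _outstanding(0)
    {
    }

    // Call before sayHello_async. Returns false if the limit has been
    // reached, in which case the caller should hold back or drop the event.
    bool tryReserve()
    {
        std::lock_guard&lt;std::mutex&gt; lock(_mutex);
        if(_outstanding &gt;= _limit)
        {
            return false;
        }
        ++_outstanding;
        return true;
    }

    // Call when sayHello_async returns true (sent immediately), or from
    // ice_sent or ice_exception on the callback object.
    void release()
    {
        std::lock_guard&lt;std::mutex&gt; lock(_mutex);
        if(_outstanding &gt; 0)
        {
            --_outstanding;
        }
    }

private:
    std::mutex _mutex;
    const std::size_t _limit;
    std::size_t _outstanding;
};</pre>
<p>With a window like this, a publisher that outpaces a slow receiver stops queuing new requests once the limit is reached, so memory use stays bounded instead of climbing with the backlog.</p>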
<p>The addition of the guaranteed non-blocking semantics renders much of the content of my first three articles obsolete. Developing GUI applications with Ice is now much more straightforward, and I can offer much simpler advice: all remote invocations made from the GUI should be asynchronous!</p>
<p>Earlier I mentioned the issue of strict ordering guarantees for twoway invocations. The call queue implementation in <a href="http://www.zeroc.com/newsletter/issue12.pdf">Issue 12</a> preserved the order of requests, whereas the subsequent strategies did not (and neither does the non-blocking asynchronous solution in Ice 3.3). If your application depends on the order in which requests are dispatched in the server, you must take additional measures such as disabling portions of the UI while a call is in progress.</p>
<p>Finally, since AMI callbacks are invoked by an Ice thread, it may not be safe for your callbacks to manipulate the GUI directly. This is the topic of my article in <a href="http://www.zeroc.com/newsletter/issue15.pdf">Issue 15</a>, which is still as relevant as ever.</p>
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2008/07/28/integrating-ice-with-a-gui-revisted/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Regularity in Language Mappings</title>
		<link>http://zeroc.com/blogs/matthew/2008/07/25/regularity-in-language-mappings/</link>
		<comments>http://zeroc.com/blogs/matthew/2008/07/25/regularity-in-language-mappings/#comments</comments>
		<pubDate>Fri, 25 Jul 2008 16:37:23 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[Language Mappings]]></category>

		<category><![CDATA[C++]]></category>

		<category><![CDATA[Google protocol buffers]]></category>

		<category><![CDATA[Ice]]></category>

		<category><![CDATA[java]]></category>

		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=7</guid>
		<description><![CDATA[The Slice language mappings are very natural. And they are very regular. What I mean by that is, if you know one language mapping, you know what to expect for another language mapping.
Does that mean things are always identical? No. There are necessary differences imposed by the target language. For example, for a Slice interface [...]]]></description>
			<content:encoded><![CDATA[<p>The Slice language mappings are very natural. And they are very regular. What I mean by that is, if you know one language mapping, you know what to expect for another language mapping.</p>
<p>Does that mean things are always identical? No. There are necessary differences imposed by the target language. For example, for a Slice interface Foo, in C++ you use <span style="font-size: 10pt;font-family: Courier New">FooPrx::checkedCast</span>, and in Java and C# you have to use <span style="font-size: 10pt;font-family: Courier New">FooPrxHelper.checkedCast</span>. There are also cases where we take advantage of a language-specific feature, such as C# delegates in the new <a href="http://www.zeroc.com/labs/silverlight/index.html">AMI Silverlight mapping</a>. These differences make a mapping more convenient to use, but less regular.</p>
<p>Our general rule is to keep the mappings as similar as possible, departing from that only where a language feature produces a more natural and convenient mapping.</p>
<p>Regularity aids in learning and comprehension. As modern-day programmers, we are expected to be multi-lingual. Anything Ice can do to aid comprehension is much appreciated! Unfortunately, the same cannot be said for the Google Protocol Buffers mappings.</p>
<p>The first thing you will notice when using PB is that the generated type name is different in each target language. This is not the case with any of the Slice language mappings, where the target language type name, by default, always matches the Slice type name.</p>
<pre><code>// Slice
module Demo
{
    class Hello
    {
        int id;
    };
};
</code></pre>
<p>The generated code for this example always produces a set of types in the <span style="font-size: 10pt;font-family: Courier New">Demo</span> namespace. For example, in C++, the types <span style="font-size: 10pt;font-family: Courier New">Demo::Hello</span> and <span style="font-size: 10pt;font-family: Courier New">Demo::HelloPrx</span> are emitted. With PB, the situation is different.</p>
<pre><code>// Person.proto
package tutorial;
option java_outer_classname = "PersonPB";
message Person
{
    required int32 id = 1;
}
</code></pre>
<p>As with the Slice language mappings, the C++ type generated by <span style="font-size: 10pt;font-family: Courier New">protoc</span> is <span style="font-size: 10pt;font-family: Courier New">tutorial::Person</span>. However, for Java, the type is <span style="font-size: 10pt;font-family: Courier New">tutorial.PersonPB.Person</span>, and for Python, it is <span style="font-size: 10pt;font-family: Courier New">Person_pb2.Person</span>.</p>
<p>Unlike the Slice-to-Java mapping, where the translator creates a package containing separate classes, the PB-to-Java mapping produces a single file that contains all definitions for a proto file. I don&#8217;t think a single file containing nested classes was a very good choice. For the application programmer, this causes rather lengthy and obfuscated class names. It also forces the addition of the <span style="font-size: 10pt;font-family: Courier New">option java_outer_classname</span> construct. Why? By default the class file that is emitted by the <span style="font-size: 10pt;font-family: Courier New">protoc</span> compiler has the same name as the proto file. In the example above, a single file <span style="font-size: 10pt;font-family: Courier New">tutorial/Person.java</span> would be generated. However, this file wouldn&#8217;t compile since Java does not allow a nested class to have the same name as an outer class. Therefore, by necessity the PB developers added the <span style="font-size: 10pt;font-family: Courier New">java_outer_classname</span> option to force the <span style="font-size: 10pt;font-family: Courier New">protoc</span> compiler to produce a different file name.</p>
<p>The Slice-to-Python mapping, again, produces a package containing all of the necessary classes. The PB-to-Python mapping produces a single file. For example, <span style="font-size: 10pt;font-family: Courier New">Person.proto</span> emits a single file <span style="font-size: 10pt;font-family: Courier New">Person_pb2.py</span>. The package specified by the protocol definition has no effect on the generated package name. Personally, I find this very unexpected, and not at all desirable. For example, consider a change to the name of the protocol file. Now, at the very least, all of the import statements in your code must change. Far from ideal!</p>
<p>With the Slice language mappings, identifiers and methods contained within an interface, class or struct are always the same. For the example above, a class <span style="font-size: 10pt;font-family: Courier New">Hello</span> is emitted containing the member variable <span style="font-size: 10pt;font-family: Courier New">id</span>. If you know one Slice mapping, you know them all.</p>
<p>For the PB language mappings, the identifiers emitted by the various mappings are, surprisingly, different for each language. In C++, the variable <span style="font-size: 10pt;font-family: Courier New">id</span> maps to the member functions <span style="font-size: 10pt;font-family: Courier New">Person::has_id</span>, <span style="font-size: 10pt;font-family: Courier New">Person::id</span>, and so on. In Java, the methods are named <span style="font-size: 10pt;font-family: Courier New">Person.hasId</span> and <span style="font-size: 10pt;font-family: Courier New">Person.getId</span>, and in Python the mapping uses <span style="font-size: 10pt;font-family: Courier New">Person.id</span> and <span style="font-size: 10pt;font-family: Courier New">Person.HasField("id")</span>. Each of these is different, and worse, for no obvious reason. Once again, knowledge of one mapping does not impart knowledge of any other.</p>
<p>When it comes to creating instances of the various classes, the situation is, again, different in each language. For C++ and Python, you create an instance of the class and populate the members, but Java has a totally different mechanism. The Java mapping uses a factory pattern (called a Builder), where you construct a Builder and then use it to create an immutable message instance. For example:</p>
<pre><code>tutorial.PersonPB.Person p = tutorial.PersonPB.Person.newBuilder().
    setId(1).build();
</code></pre>
<p>I&#8217;m not saying that builders are bad; they are not. Because of builders, it is impossible to create an uninitialized message instance, which is a good thing. However, why the different paradigms? What is good in Java would also be good in C++ and Python.</p>
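<p>To make that concrete, here is a minimal sketch of what a builder-style API could look like in Python. It is not part of the PB Python mapping: <span style="font-size: 10pt;font-family: Courier New">PersonBuilder</span> and its methods are hypothetical names chosen for illustration, and a namedtuple stands in for an immutable message.</p>
<pre><code>import collections

# Immutable message type; a hypothetical stand-in, not generated PB code.
Person = collections.namedtuple("Person", ["id", "name"])


class PersonBuilder(object):
    """Hypothetical builder: collects fields, then checks them in build()."""

    def __init__(self):
        self._fields = {}

    def set_id(self, value):
        self._fields["id"] = value
        return self  # returning self allows chained calls

    def set_name(self, value):
        self._fields["name"] = value
        return self

    def build(self):
        # Refuse to hand out a message with any field left unset.
        missing = [f for f in Person._fields if f not in self._fields]
        if missing:
            raise ValueError("missing required fields: %s" % ", ".join(missing))
        return Person(**self._fields)


p = PersonBuilder().set_id(1).set_name("Matthew").build()
</code></pre>
<p>In this sketch, calling <span style="font-size: 10pt;font-family: Courier New">build()</span> before every field is set raises <span style="font-size: 10pt;font-family: Courier New">ValueError</span>, so a half-filled message never escapes the builder.</p>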
<p>As you can see, the Slice language mappings are regular, and as close to each other as possible. This wasn&#8217;t an accident. We are all very aware, as daily users of our own product, that this is important. It is a pity that the PB mappings don&#8217;t take the same approach.</p>
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2008/07/25/regularity-in-language-mappings/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Google Protocol Buffers Integration</title>
		<link>http://zeroc.com/blogs/matthew/2008/07/19/google-protocol-buffers-integration/</link>
		<comments>http://zeroc.com/blogs/matthew/2008/07/19/google-protocol-buffers-integration/#comments</comments>
		<pubDate>Sat, 19 Jul 2008 10:56:05 +0000</pubDate>
		<dc:creator>matthew</dc:creator>
		
		<category><![CDATA[Google protocol buffers]]></category>

		<category><![CDATA[Ice]]></category>

		<category><![CDATA[binary protocols]]></category>

		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://zeroc.com/blogs/matthew/?p=3</guid>
		<description><![CDATA[I was very interested last week to see the release of Google Protocol Buffers. Any contribution to the open source community should be congratulated!
The rapid demise of XML as a data store for large amounts of non-human readable structured data, and for RPC (such as SOAP), comes as no surprise to me. XML, although touted [...]]]></description>
			<content:encoded><![CDATA[<p>I was very interested last week to see the release of <a href="http://code.google.com/apis/protocolbuffers">Google Protocol Buffers</a>. Any contribution to the open source community should be congratulated!</p>
<p>The rapid demise of XML as a data store for large amounts of non-human readable structured data, and for RPC (such as SOAP), comes as no surprise to me. XML, although touted as such, is clearly not human readable. On top of that it is both bandwidth and storage intensive, and requires huge amounts of CPU cycles to process. Even in this day and age of fast CPUs, terabytes of storage and gigabit networks, every byte and cycle still counts, especially for huge companies like Google with vast amounts of data. Bandwidth is not free, and neither are cycles (someone has to pay that power bill!).</p>
<p>Now we&#8217;ve reached a new milestone. Web services are quickly dying a well-deserved death. WSDL, despite arguments to the contrary, is nothing new (Michi has written about this in the past, see <a href="http://zeroc.com/newsletter/issue2.pdf">&#8220;To Slice or not to Slice&#8221;</a> for details). It is nothing more than an exceptionally convoluted, and unreadable, form of an interface definition language. SOAP, WSDL&#8217;s partner in crime, is nothing but hype. Finally, the toilet of history has been flushed, and we can watch the last vestiges of the colossal mistake that is web services swirl down the drain.</p>
<p>The lessons of the past are finally being relearned. Slow, fat and bloated is out. Sleek, small and fast is in. Binary encodings and protocols are undergoing a renaissance. All of this is good news for ZeroC and Ice. We understand speed. We live for simplicity.</p>
<p>So where do Google protocol buffers fit in? Other than agreeing with my general world view that binary encodings for non-human readable data are a good thing, Google protocol buffers are only part of the puzzle for many applications. What is missing is a facility to pass a protocol buffer object over the wire. It didn’t take long for some developers to <a href="http://groups.google.com/group/protobuf/browse_thread/thread/73e6cabaaffb9760/9058d1f2ab1ca1db?lnk=gst&amp;q=%3D%3D#9058d1f2ab1ca1db">discuss starting such a project</a>. It also didn’t take long for Blair Zajac to point out <a href="http://groups.google.com/group/protobuf/browse_thread/thread/73e6cabaaffb9760/9058d1f2ab1ca1db?lnk=gst&amp;q=blair#9058d1f2ab1ca1db">here</a> and <a href="http://groups.google.com/group/protobuf/browse_thread/thread/2b14dd79aa6d0e4c/278a8d0906ea823f?lnk=gst&amp;q=blair#278a8d0906ea823f">here</a> that Ice would be a good companion.</p>
<p>I shouldn’t have to go on about why using Ice for an RPC mechanism is a good thing. I could trot out all the standard arguments such as speed, flexibility, security, support for both synchronous and asynchronous operations, firewall traversal, quality <a href="http://www.zeroc.com/doc/">documentation</a>, and so on and so forth. But I won’t <img src='http://zeroc.com/blogs/matthew/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Using Ice, it is already pretty easy to pass any data that can be encoded to a sequence of bytes, such as the Google protocol buffers.</p>
<p>Let&#8217;s take a quick look at what this would look like in Python.</p>
<p><code>// .proto<br />
package tutorial;<br />
message Person {<br />
required int32 id = 1;<br />
required string name = 2;<br />
optional string email = 3;<br />
}<br />
// Slice<br />
module Demo<br />
{<br />
sequence&lt;byte&gt; Person;<br />
interface Hello<br />
{<br />
void sayHello(Person p);<br />
};<br />
};<br />
</code></p>
<p>In the client:</p>
<p><code><br />
# python<br />
hello = Demo.HelloPrx….<br />
p = Person_pb2.Person()<br />
# Fill in details<br />
hello.sayHello(p.SerializeToString())<br />
</code></p>
<p>In the server:</p>
<pre><code># python
class HelloI(Demo.Hello):
    def sayHello(self, s, current=None):
        # s arrives as the raw byte sequence declared in the Slice definition
        p = Person_pb2.Person()
        p.ParseFromString(s)
        print("Hello World from %s" % str(p))
</code></pre>
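<p>Stripped of Ice and of the generated classes, the busy work is just a serialize call on one side and a parse call on the other. The following sketch runs without Ice or protobuf installed, under three assumptions: the <span style="font-size: 10pt;font-family: Courier New">Person</span> class here is a hypothetical stand-in for <span style="font-size: 10pt;font-family: Courier New">Person_pb2.Person</span>, JSON stands in for the protocol buffers wire format, and a direct method call stands in for the remote invocation.</p>
<pre><code>import json


class Person(object):
    """Hypothetical stand-in for the generated Person_pb2.Person class."""

    def __init__(self, id=0, name="", email=""):
        self.id = id
        self.name = name
        self.email = email

    def SerializeToString(self):
        # JSON stands in for the protocol buffers wire format here.
        fields = {"id": self.id, "name": self.name, "email": self.email}
        return json.dumps(fields, sort_keys=True).encode("utf-8")

    def ParseFromString(self, data):
        fields = json.loads(data.decode("utf-8"))
        self.id = fields["id"]
        self.name = fields["name"]
        self.email = fields["email"]


class HelloI(object):
    """Plays the role of the servant: it only ever sees bytes."""

    def sayHello(self, s, current=None):
        p = Person()
        p.ParseFromString(s)
        return "Hello World from %s" % p.name


# Client side: fill in the message, then hand over its encoded form.
p = Person(id=1, name="Matthew")
greeting = HelloI().sayHello(p.SerializeToString())
print(greeting)
</code></pre>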
<p>Not very complicated, but can we do better? What would be really cool is if you could pass a Person object as a method parameter and have it magically appear in the server as a Person object. In other words, automate all that busy-work code. Now that would be great!</p>
<p>That is exactly what I&#8217;ve spent the past few days doing, and the result is our latest <a href="http://www.zeroc.com/labs">ZeroC Labs</a> project, which you can read about <a href="http://www.zeroc.com/labs/protobuf">here</a>. As always, source code is readily <a href="http://www.zeroc.com/labs/protobuf/download.html">available</a>.</p>
<p>We’re releasing this labs project as an experiment, to gauge interest. If the community finds it useful, we’ll integrate this more fully into a future release of Ice. It may even be possible to not only support the Google protocol buffers encoding, but also other encodings such as C# and Java serialized types.</p>
<p>Please look over what this integration has to offer, and give us any feedback you might have!</p>
<p>Have fun with Ice, Matthew</p>
]]></content:encoded>
			<wfw:commentRss>http://zeroc.com/blogs/matthew/2008/07/19/google-protocol-buffers-integration/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
