Blog

This is where I keep my blog. You can keep up with the progress of various projects and my general thoughts on software development and technology. If you're a fan of RSS, you can point your favourite RSS client to the site's RSS feed.

David Ryan (aka Oobles)
david at livemedia dot com dot au

Binary Object REst Distributed (BORED) system - Part 7 - Uniform Interface Constraint

It's been a couple of weeks since the last BORED post. Something has gone wrong with my very old Thinkpad. It now only boots in safe mode with 640x480 resolution; not the nicest environment to work in. I'm holding out for another few weeks before hopefully getting hold of a new Mac. I'm still not sure if it will be an Air, Book or Pro. The posts here will probably slow down until they're out.

For now, back to BORED. Today's post is probably the most interesting of the posts and highlights the real issue that BORED, Argot and every other protocol is trying to solve: the movement of information and knowledge between client and server. This is very different to the simple task of moving data (i.e. bits and bytes). The problem of moving knowledge between applications is the central aspect of what draws me to this otherwise dull area of computer science.

I look at the current browsers, programming languages and enterprise systems and see a single underlying problem: we have very little understanding of how to move knowledge between systems. Solving this problem can lead to more fluidity of data with less work by programmers. This should also lead to better usability for the applications we build. There's a lot of work to do and probably a few books to be written in this area before it will be solved. BORED is an exercise in breaking out of the mold and seeing if a better approach can be found. Without further ado, let's get back to BORED!

The BORED protocol has now been tested against some of the challenging REST constraints. The next and probably the most difficult constraint to be tested is the Uniform Interface Constraint. This is the point where the request message data structures hit the target object and the mismatch between a hypermedia system and other types of interactions with servers is most obvious. As the aim of the BORED protocol is to find some alignment between REST and Object Orientated systems, this is where things should get interesting.

Uniform Interface
The Uniform Interface constraint is one of the more interesting constraints of REST. It reduces all operations to a small set of file-like operations, e.g. GET, POST, PUT, DELETE, HEAD, etc. In the case of BORED, however, I'm trying to bring together the concept of an Object Orientated system with that of a Hypermedia system in a sensible way. This is a good time to review the BORED architectural model:

client --[request]-->Server-->Container-->Object Receiver|Object

client<--[response]--Server<--Container<--Object Receiver|Object

The BORED Remote Message Call (RMC) model places all interface request data into the message data portion of the request. This is delivered to the Object Receiver, which uses this information to interact with the target Object. These interactions could involve any one of the following:

Object Receiver -------> Document/File

Object Receiver -------> Object Instance with public methods

Object Receiver -------> Data Collection

Object Receiver -------> Proxy Interface

Object Receiver -------> Etc...

It is also worth reviewing what Fielding has to say about the Uniform Interface Constraint:

"The central feature that distinguishes the REST architectural style from other network based styles is its emphasis on a uniform interface between components (Figure 5-6). By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved. Implementations are decoupled from the services they provide, which encourages independent evolvability. The trade-off, though, is that a uniform interface degrades efficiency, since information is transferred in a standardized form rather than one which is specific to an application's needs. The REST interface is designed to be efficient for large grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction."

As stated, it is the Uniform Interface constraint that really sets the REST approach apart from many other systems. It is the simplicity of the uniform interface that makes the interactions between browser and web server so powerful.

Fielding continues with:

"In order to obtain a uniform interface, multiple architectural constraints are needed to guide the behaviour of components. REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self descriptive messages; and, hypermedia as the engine of application state. These constraints will be discussed in Section 5.2."

The Uniform Interface constraint therefore has multiple sub-constraints. Any deviation from these constraints will cause BORED to diverge from the REST approach. However, Fielding also states that the Uniform Interface constraint is a trade-off between degrading efficiency and providing an:

"efficient interface for large grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction."

This trade-off is clearly shown in AJAX based applications. Application designers are forced to use the REST approach for all aspects of the client-server interactions. An AJAX based application downloads Javascript which often makes remote calls back to the web server. These AJAX calls are better suited to a solution which allows a program centric interaction with the server (note I'm being careful not to use the term RPC). The client may be attempting to return document fragments or even simple single string responses. In these situations the uniform interface constraint creates additional work for the developer and designer. These AJAX/Web 2.0 interactions would benefit from a stronger binding between client and server.

The AJAX/Web 2.0 example shows the trade-off that Fielding discusses with regard to REST. However, the trade-off has obviously served the Web Hypermedia system well to this point. Take for example the simplicity of:

http://www.livemedia.com.au/my_image.jpg

By entering a URL into a browser we imply the GET request, and the image is retrieved. The web's power is driven through this simplicity.

At this point it is worth taking a small detour into the realm of data contracts. Understanding the different types of data contracts that client/server systems use will provide a better set of tests on which to base the BORED protocol.

Data Contracts
The topic of data contracts is probably the most interesting aspect of distributed computing. A data contract is an agreement between client and server that sending a specific set of data to a location will result in an agreed set of other data being returned. The contract can range from being implied, to being rigidly defined using procedure call semantics (as is the case in CORBA IDL). The philosophy around data contracts changes with each new technology and fad.

The reason for this constant flux is that what is required changes for different purposes. If a user is involved then human cognition is the most important part of the contract. If the communication is purely between code on both client and server then, as long as the client matches the server, the contract can be implied. If the clients are many and varied and are using a third-party service then consistency and an Interface Definition Language (IDL) are desirable. If the client wishes to discover new interfaces then discoverability and an associated IDL are a requirement. Finally, in some cases an IDL does not go far enough and a full and independent textual specification (e.g. an RFC) is required.

Each of the methods of creating data contracts implies different requirements for the BORED protocol. The following is a simple breakdown of different contracts and some implications for the BORED protocol. There have probably been better and more thorough analyses of data contracts done before; if you're aware of any, please let me know via comments.

Human Cognition Data Contract
The URL is probably the best example of providing human cognition to a data contract. By reading a URL a user is able to have a fairly good idea of what information will be returned. There is obviously skill in defining a good URL structure for any web site. However, current URLs also include request parameters which can modify the result of a particular page.

Take for example the following hypothetical request:

http://www.livemedia.com.au/store_search.x?s=books&author=ryan&page=2

Binary Object REst Distributed System - Part 6 - Cache Constraint

The BORED protocol already meets the first two constraints of REST; client-server and stateless. We've also extended the client-server constraint to allow asynchronous client-server. The next REST constraint to meet is the Cache constraint.

Cache
Returning to Fielding's REST dissertation, we find:

"Cache constraints require that the data within a response to a request be implicitly or explicitly labelled as cacheable or non-cacheable. If a response is cacheable, then a client cache is given the right to reuse that response data for later, equivalent requests."

In the BORED protocol there's an additional requirement, which relates to the stateless requirement. To label a response as cacheable or non-cacheable requires that the request is uniquely identifiable. In BORED, the stateless request data is broken into two parts: the location and the message data. To satisfy this constraint a proxy server or client must identify the location and the message data as a single object and match this against the response data. As the request message data is binary, the simplest solution is for a client or proxy server to keep a hash of the message data alongside the location. To improve performance this hash value could be added to the request data, giving caches a ready-made key and sparing them the cost of calculating it. It's important to add that the hash should only be based on the message data. This allows proxies to perform operations such as rerouting messages to new locations without needing to update the hash value.
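As a rough illustration of the idea above, here is a minimal Python sketch. The function names and the choice of SHA-256 are my own assumptions for the example; the post does not name a hash algorithm. It shows the hash covering only the message data, with the location kept beside it to form the cache key.

```python
import hashlib
from typing import Tuple


def message_hash(message_data: bytes) -> bytes:
    """Hash only the binary message data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(message_data).digest()


def cache_key(location: str, message_data: bytes) -> Tuple[str, bytes]:
    """Treat location plus message hash as the identity of a request."""
    return (location, message_hash(message_data))


# A proxy that reroutes a message changes the location but can reuse the
# hash, because the hash never covered the location.
data = b"author=ryan&page=1"
old_key = cache_key("bored://a.example/bookstore", data)
new_key = cache_key("bored://b.example/bookstore", data)
```

The two keys differ as a whole, but their hash component is identical, which is the property the rerouting case relies on.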

To support the response aspect of the cache requirement, BORED includes cache information in the response header:

preamble - BORED
version
dictionary parts
available request slots
request identifier

response code
cache information

In the REST mismatches with HTTP Fielding writes:

"Differentiating Non-authoritative Responses
One weakness that still exists in HTTP is that there is no consistent mechanism for differentiating between authoritative responses, which are generated by the origin server in response to the current request, and non-authoritative responses that are obtained from an intermediary or cache without accessing the origin server. The distinction can be important for applications that require authoritative responses, such as the safety-critical information appliances used within the health industry, and for those times when an error response is returned and the client is left wondering whether the error was due to the origin or to some intermediary. Attempts to solve this using additional status codes did not succeed, since the authoritative nature is usually orthogonal to the response status.

HTTP/1.1 did add a mechanism to control cache behaviour such that the desire for an authoritative response can be indicated. The "no-cache" directive on a request message requires any cache to forward the request toward the origin server even if it has a cached copy of what is being requested. This allows a client to refresh a cached copy, which is known to be corrupted or stale. However, using this field on a regular basis interferes with the performance benefits of caching. A more general solution would be to require that responses be marked as non-authoritative whenever an action does not result in contacting the origin server. A Warning response header field was defined in HTTP/1.1 for this purpose (and others), but it has not been widely implemented in practice."

When the request message headers are developed in detail it will be important to include the ability to define a 'no-cache' directive. The cache information returned in the response should also indicate if the response is non-authoritative.
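To make these two requirements concrete, here is a small Python sketch of a cache that honours a 'no-cache' request and marks anything it serves from its own store as non-authoritative. The Response fields, the Cache class and the origin callable are all hypothetical names invented for this example; none of them are part of the protocol yet.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class Response:
    code: int
    body: bytes
    cacheable: bool
    authoritative: bool = True  # True only when the origin produced it


class Cache:
    def __init__(self, origin: Callable[[str, bytes], Response]):
        self._origin = origin
        self._store: Dict[tuple, Response] = {}

    def fetch(self, location: str, message_data: bytes,
              no_cache: bool = False) -> Response:
        key = (location, message_data)
        if not no_cache and key in self._store:
            # Served without contacting the origin: flag it as such.
            return replace(self._store[key], authoritative=False)
        response = self._origin(location, message_data)
        if response.cacheable:
            self._store[key] = response
        return response
```

A client that needs an authoritative answer passes no_cache=True and always reaches the origin; every other caller can tell from the flag whether the answer came from an intermediary.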

Location only constraint
At this point we add another new constraint to the system; the location only constraint. The location in each request should only include the location specific information. Request parameters must only be supplied in the message data. This constraint is designed to ensure the separation of the message data from the location data. This allows fast and easier routing of message data.

This constraint is the direct opposite of the common practice of encoding request parameters onto URIs in HTTP. For example:

http://www.livemedia.com.au/bookstore?author=ryan&page=1&list=10

In the BORED protocol the location must be separate from the message data.

(location bored://www.livemedia.com.au/bookstore) (message author=ryan&page=1&list=10)

This constraint is designed to combine with the cache constraint to ensure message parameters are not confused with location data in cache systems. It also ensures that the required meta data to decode the message is included in the message meta data.
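The split can be sketched in a few lines of Python using the standard urllib.parse module. The function name split_location is made up for this example, and the message is left as plain text only for readability; in BORED it would be binary message data.

```python
from urllib.parse import urlsplit


def split_location(http_url: str):
    """Separate an HTTP-style URL into a BORED location and message text."""
    parts = urlsplit(http_url)
    # The location keeps only host and path; parameters move to the message.
    location = f"bored://{parts.netloc}{parts.path}"
    return location, parts.query


location, message = split_location(
    "http://www.livemedia.com.au/bookstore?author=ryan&page=1&list=10"
)
```

A router or cache only ever needs to look at location, while message travels untouched to the Object Receiver.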

It is interesting to note that the cache constraint requires the stateless constraint to function. A cache must be able to deal with a whole message uniquely to operate correctly.

Binary Object REst Distributed System - Part 5 - Stateless constraint

The initial constraints of REST were easy to satisfy in BORED. In this post we tackle the stateless constraint. As BORED uses binary message data this constraint creates some unique challenges.

Stateless

The Stateless requirement is REST's second constraint. Fielding writes:

"We next add a constraint to the client-server interaction: communication must be stateless in nature, as in the client-stateless-server (CSS) style of Section 3.4.3 (Figure 5-3), such that each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server. Session state is therefore kept entirely on the client."

To see the stateless requirement more clearly I'll review HTTP. Here's an example of an HTTP/1.1 request and response.

GET http://www.eienet.com.au/ HTTP/1.1
...

HTTP/1.1 200 OK
...

The request encodes the full description of what the client is requesting in the URI and HTTP GET verb. To align with REST, BORED requires a similar location specifier. Let's assume a URI for now, however, to support embedded devices this will need to be more flexible.

To satisfy the stateless constraint, the following parts of BORED are required in the request:

prefix - BORED
version
request identifier
location - URI location or other location type.
....
message
-- message meta data.
-- message - request data.
---- operation - GET,META,POST,METHOD,etc
---- message data

To meet the stateless requirement the BORED protocol includes the location and full request data.
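The list above maps naturally onto a pair of record types. The following Python sketch is only one way to picture it; the class and field names are assumptions, and the real structures would be defined in Argot rather than Python.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Message:
    meta_data: bytes   # describes the structure of `data`
    operation: str     # e.g. "GET", "META", "POST", "METHOD"
    data: bytes = b""


@dataclass
class Request:
    version: Tuple[int, int]  # (major, minor)
    request_id: int
    location: str             # URI or other location type
    message: Message
    prefix: str = "BORED"


def handle(request: Request) -> str:
    """A stateless handler: its result depends only on the request itself."""
    return f"{request.message.operation} {request.location}"


req = Request((0, 1), 1, "bored://www.livemedia.com.au/document.pdf",
              Message(meta_data=b"", operation="GET"))
```

Because handle reads nothing but its argument, any server (or cache) holding the same request would produce the same answer, which is what the stateless constraint asks for.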

In the case of a binary protocol an interesting addition is the inclusion of "message meta data". This is Argot specific, but it can be extended to any binary system that has a meta data definition. In the Argot case the meta data specifies the data structures of the data in the message.

The "message meta data" describes the message data; however, at this point there's no meta data to describe the actual request structure. To understand how BORED will solve this it is worth introducing the concept of an Argot Message Format. The Argot Message Format is designed to be completely self defining. Here's a short description from the Argot Programmer's guide.


Argot Message Files & Dictionaries

Argot message files are binary encoded files that provide the specification of their data with the data. An Argot file contains three parts; a meta dictionary, a data dictionary and the data.

The Argot Message Format allows the full specification of the data to be transferred with the data. This requires no external definition of the data. For an application to be able to read the file its type library must contain all the data types used in the file. A Type Map is generated from the data dictionary portion of the file to read the data. The general format of the file is:

The receiver of an Argot enabled file is able to read the dictionary and compare the data types of its own dictionary with those of the file's dictionary. Once the types of the file dictionary have been matched with those of the application reading the file, the data can be read. This completely removes the need for a static common domain schema. Each application and file in effect contains its own schema.

This can be re-illustrated using the following venn diagram:

The process of reading a file involves:

  1. Binary compare of meta dictionary map. The very first dictionary map of the meta dictionary is the core meta dictionary. The only way to read this entry is by performing a binary compare. These are the base dictionary items used to describe new items. Please refer to the meta dictionary reference section for details of the core meta dictionary.
  2. Build and read Meta dictionary. The rest of the meta dictionary is read and mapped between the application and file.
  3. Read the Data dictionary. Using the Type Map produced from entries in the Meta Dictionary the Data dictionary is read. A Data dictionary type map is created based on the types identified.
  4. Read the Data. Using the Data dictionary type map the actual data of the file is read.

The Argot message format can be used anywhere that a data buffer can be transferred: in files, message oriented middleware, email, etc.


It would be easy to simply use the Argot Message Format as the full request structure to be delivered to the server. However, carrying the 'meta dictionary' with each and every request adds a lot of overhead. This would also hide the contents of the request data requiring a cache/proxy to read the meta dictionary, data dictionary and data before it can understand the request.

The solution used in BORED is to use the version information of the protocol as a moniker for a data dictionary. When a server receives a request it uses the BORED protocol version to choose the corresponding data dictionary. This is like having the meta dictionary and data dictionary of the request at the start of every request. The BORED request and response messages are themselves specified in this data dictionary.

The BORED request message however also requires a meta data section for times when the meta data for the request does not include data required by the object receiver. The message data dictionary expands on the request data dictionary to include elements required by the message.

Logically this looks as follows:

[ meta dictionary ] [ request data dictionary ]
---- [ request ... [ message [ message data dictionary] [data ]] ... ]

This allows the Request to logically contain the full meta dictionary, data dictionary, and data for the full BORED request in every message without the overhead of the full meta dictionary and data dictionary.

Using the above method has the drawback that the "request data dictionary" must define every aspect of the request message structure. This includes security, cache information formats, header formats, and others. This creates an issue for very small devices that only support a subset of the request headers. A solution to this is to break the "request data dictionary" into parts. The client and server can then identify in their request and response the parts of the request data dictionary they support. For simplicity the parts supported can be indicated via a bit-flag in the version part of the header. For instance, the version header could use three 8-bit values. The first two would be the major and minor version, with the third being the bit-flag for the parts of the request data dictionary supported.
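A minimal Python sketch of such a three-byte version header follows. The particular flag names and bit assignments (PART_CORE and so on) are invented for illustration; the post has not yet decided how the dictionary is divided.

```python
import struct

# Hypothetical dictionary parts, one bit each.
PART_CORE = 0x01
PART_SECURITY = 0x02
PART_CACHE = 0x04


def pack_version(major: int, minor: int, parts: int) -> bytes:
    """Encode major, minor and the parts bit-flag as three unsigned bytes."""
    return struct.pack("BBB", major, minor, parts)


def unpack_version(header: bytes):
    """Decode the first three bytes back into (major, minor, parts)."""
    return struct.unpack("BBB", header[:3])


def common_parts(client_parts: int, server_parts: int) -> int:
    """Parts both sides understand: the bitwise AND of their flags."""
    return client_parts & server_parts


header = pack_version(0, 1, PART_CORE | PART_CACHE)
```

A small device would advertise only PART_CORE, and a full server would restrict itself to whatever common_parts returns when answering it.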

Building on the last post, the request header structure now looks like this:

prefix - BORED
version
dictionary parts
request identifier
...

and response:

prefix - BORED
version
dictionary parts
available request slots
request identifier
...

Delivering the stateless constraint using a binary protocol has required developing a few tricks. In particular, using the request version number as a key to the meta dictionary and request data dictionary has allowed the solution to deliver a technically correct construct while still reducing the amount of network traffic for each request/response. Using the bit-flag to specify the supported parts of the request data dictionary also allows the solution to scale from small devices to large, full-featured systems.


Binary Object REst Distributed (BORED) system - Part 4 - Constraints & Assumptions

At this point I've introduced the BORED idea, the blueprint and provided a rough 0.1 version of the protocol. The next step is to test the message structure against various constraints & assumptions of REST.

The first constraints and assumptions to be tested are based on the constraints defined by REST. They set the ground work for the protocol and provide the constraints required to define the request/response headers.

Lossless Communication Stream
The very first assumption is that the solution will operate on a lossless bi-directional communication channel that supports streams (i.e. TCP). This assumes the transport will take care of the connection set-up and tear down. The transport will ensure that the data is received in order and provides a byte stream interface. This is a rather obvious assumption to make, however, it is important to get the basics right.

For embedded devices we will assume that if it doesn't support TCP, then another transport protocol will be provided. If the messages are small enough the protocol should also operate on UDP style network protocol. The protocol may also operate on an asynchronous transport such as message queuing and email systems.

Client-Server
The second part of the requirements is that of client-server. This is REST's first requirement. Fielding describes client-server as:

"The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client."

The client-server style requires that request data is sent to the server and that the server responds with response data. The initial definition of the protocol's request and response data is as follows. The request:

preamble - BORED
version
...
;

The response structure is the same:

prefix - BORED
version
...
;

The request and response headers look the same. Each contains a preamble that notifies the receiver that the message is using the BORED protocol. If the receiver is out of sync with the sender, the preamble also provides a point where the start of the next message can be found. The client sets the version of the protocol. The server sets the version to the version it is currently using. The server must not respond with a version that is greater than the client's.
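Both rules are simple enough to sketch in Python. The helper names below are hypothetical, versions are modelled as (major, minor) tuples, and the resynchronisation is deliberately naive: the bytes "BORED" could also occur inside a payload, so a real implementation would need more than a plain search.

```python
PREAMBLE = b"BORED"


def next_message_start(buffer: bytes, start: int = 0) -> int:
    """Return the offset of the next preamble, or -1 if none is present."""
    return buffer.find(PREAMBLE, start)


def response_version(client, server):
    """Pick the server's version, capped so it never exceeds the client's."""
    return min(client, server)


# Two bytes of junk precede the next message in this stream.
stream = b"\x00\x07BORED\x00\x01"
offset = next_message_start(stream)
```

Tuples compare element by element in Python, so min((1, 0), (0, 3)) picks (0, 3), which matches comparing major version first and minor version second.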

Asynchronous Client-Server
One of the interesting parts of Fielding's dissertation is the REST mismatch with HTTP. Fielding states:

"HTTP/1.1, though defined to be independent of the transport protocol, still assumes that communication takes place on a synchronous transport. It could easily be extended to work on an asynchronous transport, such as e-mail, through the addition of a request identifier. Such an extension would be useful for agents in a broadcast or multicast situation, where responses might be received on a channel different from that of the request. Also, in a situation where many requests are pending, it would allow the server to choose the order in which responses are transferred, such that smaller or more significant responses are sent first."

To support asynchronous requests, a request identifier needs to be added to the request and response data structures, i.e. the request:

prefix - BORED
version
request identifier
...
;

and response:

prefix - BORED
version
request identifier
...
;

The request identifier is set by the client. The server must respond with the same request identifier in the response. This allows a client and server to use a single channel and interleave requests and responses. This improves the channel usage and reduces latency which leads to a better user experience. Using a single channel for multiple requests also aligns well with the direction of CPUs containing many cores. Many threads can be assigned to a single channel.

The response can come from either a cache, server proxy, or server containing the object. The important thing is that by introducing a request identifier the protocol no longer needs to conform strictly to synchronous request/response semantics.
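Here is a minimal Python sketch of how a client might pair interleaved responses with their requests. The Channel class and its callback style are assumptions made for the example, and no bytes are actually written to a transport; only the bookkeeping is shown.

```python
import itertools


class Channel:
    """Tracks outstanding requests on one channel by request identifier."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def send(self, location: str, on_response) -> int:
        # The client chooses the identifier and remembers who is waiting.
        request_id = next(self._ids)
        self._pending[request_id] = on_response
        return request_id

    def receive(self, request_id: int, body: bytes) -> None:
        # The server echoed the identifier, so arrival order is irrelevant.
        callback = self._pending.pop(request_id)
        callback(body)

    def pending(self) -> int:
        return len(self._pending)
```

Because lookup is by identifier, the server is free to answer a small request before a large one that was sent earlier, as Fielding suggests.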

Specifying a "request identifier" is a rather simplistic approach to allowing asynchronous request/response message processing. One problem with this approach is that the server has no way of letting the client know how many messages it is able to process at one time. A possible solution to this would be for the server to respond with how many message slots it has available, i.e. the response:

prefix - BORED
version
available request slots
request identifier

For a server with constrained resources the request slots value may always be 1. Using the response message to provide the number of request slots requires that the client receive at least one response before it can know how many requests it can send. A simple solution to this would be that the server notifies the client upon initial request. This will need to be explored further in the future.
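One way a client could act on this value is sketched below in Python. Two assumptions are made that the post leaves open: "available request slots" is read as the total number of requests the server will handle at once, and the client conservatively assumes a single slot until the first response arrives. The SlotWindow name is hypothetical.

```python
class SlotWindow:
    """Client-side limit on in-flight requests, driven by server responses."""

    def __init__(self):
        self.available = 1  # assumed until the server reports otherwise
        self.in_flight = 0

    def can_send(self) -> bool:
        return self.in_flight < self.available

    def on_send(self) -> None:
        if not self.can_send():
            raise RuntimeError("no request slot available")
        self.in_flight += 1

    def on_response(self, available_slots: int) -> None:
        # Each response frees a slot and carries the server's latest figure.
        self.in_flight -= 1
        self.available = available_slots
```

Starting at one slot means the very first exchange is effectively synchronous, which is exactly the start-up cost described above.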

The other feature suggested by Fielding is that an asynchronous request could use a different channel for receipt of the response. To allow this, additional optional headers could be provided to specify a "return address" and "time to live". The "time to live" allows the client to specify how long it is willing to wait for a response. If the server is unable to provide a response before the given time it should drop the request and not deliver the response. This type of feature is added to the protocol via the optional headers because it is likely to be used rarely.

Introducing the concept of asynchronous requests and responses introduces a number of new challenges that must be explored. The proof of how well each of these ideas will work in BORED will be explored when implementing the protocol.

Binary Object REst Distributed (BORED) system - Part 3 - Structure

In this post I present version 0.1 of the BORED request and response structures. This will be refined in future posts as trade-offs are made and the protocol matures. I won't go into much detail here as future posts will provide a lot more detailed analysis.

There are only two message types in the BORED protocol; the request type and response type. Operations such as GET or POST found in HTTP are encapsulated in the message data and do not form the surrounding message. The request message consists of:

request:

preamble - BORED
version

location - URI location or other location type.

optional headers
- headers meta data
- headers data

message
- message meta data.
- message - request data.
- - operation - GET,META,POST,METHOD,etc
- - message data

optional security
- identity/signature meta data
- - optional identity
- - optional signature

The request data elements are:

preamble - "BORED" The five ASCII characters define the headers of the BORED protocol. This signifies the start of the message.
version - This will consist of a major and minor version as two unsigned 8-bit characters.
location - This is the location where the message should be delivered.
optional headers - This provides an area for additional information to be added. It is analogous to HTTP headers.
message - The message to be delivered to the specified location. This should include any POST data or URI request parameters found in HTTP.
optional security - This provides the option to supply the identity of the client and sign requests.

A simple request message might look logically like:

BORED 0.1 - BORED://www.livemedia.com.au/document.pdf (GET) - ;

Note: For now these requests are demonstrated as text for human readability. As the elements of the protocol are refined binary examples will be provided.

In the example above the following elements are present:

preamble - BORED
version - 0.1
location - BORED://www.livemedia.com.au/document.pdf
optional headers - not included
message - (GET)
optional security - not included

The response has much of the same information as the request data. The response includes a response code and caching information.

response:

preamble - BORED
version

response code
cache information

optional headers
- headers meta data
- headers data

message
- message meta data.
- message - response data.
- - message data

optional security
- optional identity/signature meta data
- optional identity
- optional signature (from response code)

The response fields include much of the same data as in the request.

preamble - "BORED" The five ASCII characters define the headers of the BORED protocol. This signifies the start of the message.
version - This will consist of a major and minor version as two unsigned 8-bit characters.
response code - The response code for the data.
cache information - Information on if the response should be cached and for how long.
optional headers - This provides an area for additional information to be added. It is analogous to HTTP headers.
message - The message to be returned to the client.
optional security - This provides the option to supply the identity of the server and sign responses.

A corresponding response might logically be:

BORED 0.1 200(OK) "No Cache" - (mime document/pdf .....data....) - ;

In this case the elements are:

preamble - BORED
version - 0.1
response code - 200(OK)
cache information - "No Cache"
optional headers - not included
message - (mime document/pdf .....data....)
optional security - not included

Tim Bray mentioned in this blog that any newly developed protocol will probably end up looking a lot like HTTP. I think he's spot on! You will find many of the same elements in different protocols. However, some of the nuances between each protocol can have a big effect on a protocol's design and flexibility. For example, by moving the GET verb from the request header into the body, we've completely changed the character of the protocol. HTTP is purposely constrained to a reduced set of verbs such as GET, PUT and POST. However, BORED places the verb into the body of the message, which allows any number of verbs to be implemented without disturbing the transfer portion of the protocol.
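To show what moving the verb into the body buys, here is a small Python sketch of an Object Receiver that looks up the operation carried in the message. Everything in it is illustrative: the ObjectReceiver class, the WORD_COUNT operation, and the 200/405 codes (borrowed from HTTP) are assumptions, not part of the BORED definition.

```python
class ObjectReceiver:
    """Dispatches the operation named inside the message to a handler."""

    def __init__(self, target):
        self._target = target
        self._handlers = {}

    def register(self, operation: str, handler) -> None:
        self._handlers[operation] = handler

    def deliver(self, operation: str, message_data: bytes = b""):
        handler = self._handlers.get(operation)
        if handler is None:
            return 405, b""  # operation not supported by this object
        return 200, handler(self._target, message_data)


document = b"binary object rest distributed"
receiver = ObjectReceiver(document)
receiver.register("GET", lambda target, data: target)
# A new verb is added without touching the transfer layer at all.
receiver.register("WORD_COUNT",
                  lambda target, data: str(len(target.split())).encode())
```

The transfer layer only has to carry the message to the receiver; which verbs exist is a matter between the client and the target object.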

Referring back to the layers of the protocol, we can see that most of the protocol is concerned with the transfer layer. The message/presentation layer is the message structure and its structure can be defined without concern for the transfer layer. The object receiver is concerned with how to process the message content and is outside the scope of the actual protocol. It is only important that any type of data can be transferred to the Object Receiver in the message body.

The layered design should allow the message data to arrive at the Object Receiver using different methods (protocols or in other data structures). This becomes important when providing a layered system. A front-end server (e.g. Apache) may receive the message and then use a different transfer protocol to pass the message to an internal system. Using this method, the Object Receiver may receive additional information regarding the request; this would be dependent on the features of the internal system. This design ensures that the message can be separated from the transfer protocol in a simple way, without having to process the data contained inside the message.

The important elements in the request which are required for the transfer layer include:

Preamble and Version - This simply sets the receiver of the message to understand and sync with the right version of the protocol.
Location - The location provides the target for the message.
Optional Headers - This can include information for proxy servers, or a request that caches do not respond to the request.
Optional Security - This can be used for signing the message request data.

The message structure is the payload of the transfer protocol. The message layer requires a separate investigation and will be developed further later in this series of posts.
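The separation between transfer elements and message can be sketched in a few lines of Python. This is only an illustration under assumptions: the Request class, the field types and the front_end_forward function are all hypothetical names, and a plain list stands in for whatever internal transfer a front-end server might use.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Request:
    # Transfer-layer elements named in the post
    version: Tuple[int, int]                                # (major, minor)
    location: str                                           # target of the message
    headers: Dict[str, str] = field(default_factory=dict)  # optional
    security: Optional[bytes] = None                        # optional signature
    # Message layer: opaque to the transfer layer
    message: bytes = b""


def front_end_forward(req: Request, internal_queue: list) -> None:
    """Hypothetical front-end: routes on location, never parses the message."""
    internal_queue.append((req.location, req.message))


queue = []
front_end_forward(Request((0, 1), "/docs/report", message=b"\x01GET"), queue)
```

The point of the sketch is that front_end_forward never inspects req.message; the bytes reach the internal system exactly as they arrived.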

This post provides the rough outline of the data to be included in requests. The message structure and meta data associated with request and response will be developed in future posts. The next steps will be to test the protocol design against REST constraints and see what other features may be useful.

Binary Object REst Distributed (BORED) system - Part 2 - Blueprint

The discussions on Steve Vinoski's blog regarding REST, RPC, ORBs, etc. highlighted that there's usually a very high level model that a distributed system is built upon. In the case of RPC it is the notion that a procedure call can be made to look local. In REST/Hypermedia it is a distributed document model with loose coupling through hyperlinks. Finally, in the case of ORBs it is an Object Request Broker; an object based remote procedure call. In all these cases the model helps define many of the constraints of the system. These constraints often permeate into every aspect of the system.

The BORED system has the rather interesting task of trying to combine the REST/Hypermedia constraints with that of an object orientated system. To do this, a high level model needs to be defined to use as the blueprint.

The BORED blueprint in this case is quite simple:

client --[request]--> Server --> Container --> Object Receiver | Object
client <--[response]-- Server <-- Container <-- Object Receiver | Object

The idea behind BORED is that a message is being delivered directly to an Object Receiver via a server and container. The message can be any data. It is up to the Object Receiver to decide how to process the message received. The Container and Server are there as conduits for the message to be delivered, however, they do not directly respond to the message. The conduits can add security constraints on who can interact with the target object and manage the life cycle of creating and destroying the target object. The container itself could also be the Object Receiver, however, to keep the model simple these types of adaptations won't be discussed.

The Object Receiver is able to respond directly to the message by returning the object data (as in a document or image). This should allow a hypermedia solution to be developed that has a simple file based Object Receiver. Alternatively, the Object Receiver may process the message and call a method as is done in a traditional RPC or ORB. Interactions could involve any one of the following:

Object Receiver -------> Document/File
Object Receiver -------> Object Instance with public methods
Object Receiver -------> Data Collection
Object Receiver -------> Proxy Interface
Object Receiver -------> Etc...
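The blueprint can be sketched roughly in Python. Everything here is hypothetical naming for illustration only: a Container that acts purely as a conduit, a FileReceiver that returns stored data, and a MethodReceiver that treats the message as the name of a public method. Looking methods up by name like this would need proper access checks in anything real.

```python
class FileReceiver:
    """Returns stored document data for any message (hypermedia style)."""
    def __init__(self, data: bytes):
        self.data = data

    def receive(self, message: bytes) -> bytes:
        return self.data


class MethodReceiver:
    """Treats the message as a method name and calls it on a target object."""
    def __init__(self, target: object):
        self.target = target

    def receive(self, message: bytes) -> bytes:
        method = getattr(self.target, message.decode("ascii"))
        return str(method()).encode("ascii")


class Container:
    """Conduit only: finds the receiver for a location and passes the message on."""
    def __init__(self):
        self.receivers = {}

    def deliver(self, location: str, message: bytes) -> bytes:
        return self.receivers[location].receive(message)


class Counter:
    """Example target object with one public method."""
    def __init__(self):
        self.n = 0

    def increment(self) -> int:
        self.n += 1
        return self.n


container = Container()
container.receivers["/readme"] = FileReceiver(b"hello")
container.receivers["/counter"] = MethodReceiver(Counter())
```

Both receivers sit behind the same deliver call, so the container never needs to know whether it is fronting a document or an object instance.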

An important note here regarding RPC and BORED. BORED is designed to support an RPC mechanism, however, it is not locked into a single mechanism. Different types of skeletons could be built into the Object Receiver. The initial mechanism will likely use a Remote Message Call mechanism, however, it is up to the Object Receiver to define the meta data and interfaces associated with it.

I have used the name "Remote Message Call" to describe the BORED call method; this is to separate it from the traditional Remote Procedure Call (RPC). A Remote Procedure Call is the language centric view that maps a set of parameters of a method on a server to an equivalent local call on a client. This is a simplistic view of RPC. In BORED there is a message centric view of RPC. That is, the remote call is defined by the data contained in the request object and the data returned in the response. The message data in the request and response forms the contract between client and server. This message data in the request or response can be bound to a language based method call on the client and server, however, it is not a requirement. As BORED is based on describing the request/response data, not the remote method call, this is not a traditional RPC mechanism. The distinction is important and ensures that some of the issues of RPC do not get embedded into BORED.

Interestingly, the BORED/RMC model reflects message queuing semantics more than it does RPC, REST or ORB semantics. The fundamental idea is that the data is contained in an envelope and delivered directly to an end point. The difference is that the BORED model is designed for synchronous request/reply semantics, whereas message queuing is uni-directional.

As an example of how BORED semantics differ from HTTP, we can look at an HTTP GET request. The HTTP protocol uses the verb GET before specifying the location of the document to be retrieved. This creates a model where the server is performing the GET operation on the document requested. The BORED system will include a GET verb inside the message and deliver it to the Object Receiver. By moving the GET verb into the message it is the "Object Receiver" processing the verb instead of the server. The intention of this is to localise the requested data to the object to which it is being delivered.
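A small Python sketch shows the difference. The message format used here (a verb, a space, then an argument) is an assumption for the example only, as are the names DocumentReceiver and server_deliver; the real message structure has not been defined yet.

```python
class DocumentReceiver:
    """Hypothetical Object Receiver that interprets verbs itself."""
    def __init__(self, text: str):
        self.text = text

    def receive(self, message: str) -> str:
        verb, _, argument = message.partition(" ")
        if verb == "GET":
            return self.text
        if verb == "APPEND":          # a verb HTTP does not define
            self.text += argument
            return "OK"
        return "UNKNOWN VERB"


def server_deliver(objects: dict, location: str, message: str) -> str:
    # The server only resolves the location; it never looks at the verb.
    return objects[location].receive(message)


objects = {"/notes": DocumentReceiver("first")}
```

Adding the APPEND verb required no change to server_deliver, which is the property being aimed for: new verbs belong to the receiver, not to the transfer part of the protocol.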

One thing I should point out at this point; the model is already making trade-offs. The most obvious to the REST aware folk will be that by encapsulating the message in an envelope and by allowing language orientated mechanisms in the object receiver, BORED removes the Uniform Interface constraint of REST at the protocol level. The Uniform Interface constraint can be catered for, however, in BORED it is not a constraint of the protocol and must be defined in the message data structure. This is an area that still needs to be explored to work out how to combine the two seemingly opposed constraints. This will be expanded further in future posts when the message data structure is explored.

In the on-going REST debate, Tim Bray provided a good description of the trade-offs found in REST versus other systems. These are good things to keep in mind while designing BORED. It reminded me that I hadn't described the layered approach used in BORED.

In the BORED model there are at least three different layers. Defining layers in a protocol ensures that the concepts of each layer do not infect other layers. The following compares the traditional OSI 7-layer model with HTTP and BORED.

OSI            HTTP                           BORED
application    application (browser/client)   object receiver (client)
presentation   mime (presentation)            message (DATA)
session        transfer (HTTP)                transfer (BORED)
transport      transport (TCP/IP)             transport (TCP/IP)

I've had a few conversations where people viewed the OSI 7-layer model as outdated and not very useful in today's protocol developments. This may be true, however I still find it useful as a backdrop to understanding the layers of different protocols. Modelling a protocol stack using this type of layering provides another view of the protocol.

In the HTTP model the HTTP transfer protocol is easily recognisable as fitting the OSI session layer. It sets up the structure of the conversation between client and server. The data returned by a GET request specifies the mime-type which sets the presentation format for the response. The uniform interface specified by HTTP spreads across both the session and application layers. This unclear distinction of which layer the GET belongs to is one example of how having a layered stack model can ensure each layer's purpose is well understood.

In the BORED model, the transfer part of the protocol will define the request/response semantics and set up the basis for communications. The transfer layer will also provide the security and location of where the message will be delivered. The message data layer defines the presentation layer of the model. The message data should provide all the data required by the Object Receiver to perform its request.

A good analogy is a physical envelope. The transfer layer is the envelope which has the address, any routing information, the sender and any security information. The message layer is the paper that is put into the envelope. The paper can contain any sort of information that the recipient can process. The Object Receiver layer is the actual data contained on the paper and directs the Object Receiver to perform an action. By ensuring that each layer is self contained, the whole system will be more flexible and easier to work with.

This post outlines the model for the BORED system. It has constrained BORED to request/response semantics directed to an Object Receiver. It has outlined the Remote Message Call (RMC) semantics used to create a solid distinction between it and Remote Procedure Calls (RPC). Finally, it outlined the layers in the protocol stack so that each layer can be analysed and its purpose described independently. In the next post I'll do a first cut of the logical elements of the protocol.

Got any comments, you can leave them on the shadow blog here.

18 August 2008 - Binary Object REst Distributed (BORED) system - Part 1

I was recently involved in a long discussion over at Steve Vinoski's blog regarding RPC and REST. The discussion has been long and multi-faceted, covering definitions of RPC, REST and various other aspects of distributed computing. Steve has recently closed the loop on the discussions referencing some comments from Stu Charlton which offer a higher level perspective. The whole thing is a good read if you're into learning about the innards of the web and views on distributed systems.

One of the benefits of having these types of discussions is learning new perspectives and technology. There's nothing like getting into the nitty gritty and working out where opinions and ideas intersect. One of the things I learnt along the way is that I had misunderstood the meaning of REST. I was told to go and read the PhD dissertation of Roy Fielding (the person who coined REST). Unless you've read Fielding's dissertation it's most likely that you don't actually know the true meaning of REST. To quote myself in Steve's blog after I had read Fielding's dissertation:

12 July 2008 - Protocol Buffers

This week Google released Protocol Buffers to much acclaim, receiving coverage on slashdot, osnews and other places. I must admit that I haven't looked too closely at the Protocol Buffers implementation. This is because I read only far enough into the documentation to discover that there's a set number of primitive types that can be encoded onto the wire. This basically equates to yet another adaptation of the Sun XDR data encoding method used in the original RPC.

I must admit that I did do a little astroturfing on the Protocol Buffers google group which did actually result in a few hits to the einet web site. It didn't equate to any downloads or any actual interest in Argot which is quite disappointing. Obviously astroturfing is not the right way to advertise a new technology.

There have also been a few responses in the blogosphere regarding this release from Ted Neward, Steve Vinoski and Stefan Tilkov. The latter two have simply deferred to Ted's entry which comprises a good analysis of Protocol Buffers. As I read this analysis, I started to think how Argot solved many of the issues pointed out in Protocol Buffers. I was going to respond to Ted's blog, but it's probably easier to do this here on my own blog.

Before getting into responding to Ted's analysis, I should say what Argot is all about. At its most basic Argot is about defining meta data for describing binary data formats and a library for encoding and decoding that data. What makes Argot interesting is that the meta data itself is encoded in binary. This creates the ground work for dynamic data agreement and various other things. There's a full description of Argot here, so I won't bother with the technical implementation details right now.

The first issue Ted found with Protocol Buffers is their claim of language and/or platform-neutrality. I must admit I do claim the same with Argot. I also still regard this as a reasonable claim, and believe the Protocol Buffers claim is also reasonable. In my mind, this is not a claim that the solution has complete coverage of all languages. It's simply a claim that no special language tricks or platform specifics have leaked into the data format. Compare this against RMI or other language specific solutions. To implement RMI in another language would be difficult as its core data encoding requires Java specific information. I do understand Ted's issue with this and agree that XML's coverage is huge. Currently I have implementations of Argot in Java, C and C#, although I'm currently only working on updating and improving Java. So yes, Argot is designed to be language and platform neutral but is currently best supported by Java.

Next Ted reminisces about when binary formats were big. Actually, I don't remember them being big at all. I remember CORBA being the big thing. Other than ASN.1 there have been very few attempts to build structured binary data formats that are independent of RPC or other systems. There are some big problems with binary formats being inflexible, tightly-coupled, etc. because there has been no real interest in actually solving these problems at a binary level. I will come back to many of these issues soon, as these are the exact problems I built Argot to solve.

Ted moves on to talk about all the advantages of XML. I agree with most of his comments on XML. For what XML was designed to be, it has done very well. And it has stayed true to its initial principles. This is great, but XML is not solving many of the things it was not designed to solve. Yes this makes little sense intentionally. There are many situations where binary data formats are required for speed, size constraints, etc. XML is not the data presentation layer silver bullet. The problem Argot, Protocol Buffers and others are trying to solve is how to move binary data around in a fast, efficient way. Ideally many of the advantages and lessons learned with XML and XML Schema can be applied to binary data formats.

As a side note, one area that I've started working in over the last few months is investigating the issues of interoperability for home area network (HAN) devices in the Energy industry. There's some really interesting work being done to build interoperable systems. They are defining the data formats to ensure all the smart appliances will work together in the home. These are really hard problems which require binary data format agreements so even the smallest of devices can work together. This is not an area where XML will be accepted. It's a market where manufactured items are costed in cents and the CPU processing power required for XML is not acceptable. However, these industries need to face all the problems that binary solutions provide without the tools and solutions that XML provides. Surely there's a better way than just accepting the old issues in binary data solutions.

To bring it back to Ted's analysis of Protocol Buffers. I'm sure he understands that there's a need for binary data. I think he takes issue that Protocol Buffers is claiming to provide similar benefits of XML in a binary format. I agree completely that Protocol Buffers does not provide most of the benefits of XML. The problem is that in this increasingly connected world where convergence will mean that every appliance in the house might soon be conversing, we really do need a solution. We actually need more people working on this and not just saying how wonderful XML is. I accept that XML is great in all that it achieves, but I don't accept that we can't have most of the benefits of XML in a binary format for where it is required.

Ted also refers to the XML binary InfoSet specification in his analysis. I think I need to be clear here that when I talk about requiring many of the benefits of XML in a binary format, I don't believe that XML binary InfoSet or similar is the answer. These solutions still require complex decoding mechanisms which once again do not work in many situations. We need the benefits of XML at a really basic binary level.

I'll now move onto the guts of Ted's analysis of Protocol Buffers and examine some of the biggest shortcomings he identified. For each one I'll discuss whether Argot solves it, whether it's required by binary communications and whether it can be done effectively.

  • Binary is hard to work with. Agreed. However, as already discussed it's a requirement in many situations. Let's make it as easy as possible to work with. Argot solves the problem by creating an incredibly flexible meta data structure that can be extended. In fact I'm willing to claim that the Argot meta data format (which has similarities to S-Expressions) is more flexible than XML Schema and anything else out there. I agree that binary data will never be as easy as text for humans to read, however, in the situations where it is needed it is not humans reading it. In these situations we should build tools to make it easier for humans to read. The way to achieve this is through good meta data.
  • Binary is tightly-coupled. This is probably the biggest issue with binary. In fact for most situations binary solutions actually need to be tightly-coupled for each individual message. This is where Argot shines the brightest. It is the only solution I know that can allow two peers to dynamically discover each element's meta data. What this means is that a server and client can support multiple versions of the same data structure and pick the format that both will use. I believe this is a very important capability which is needed in binary solutions. It gives clients and servers the ability to change over time and still support sets of data formats. For more details read about the TypeMap in the Argot documentation. I call this allowing tight-binding and loose-coupling. An important note is that I have proved this with a server implementation of around 7kb. It does not take a huge amount of data as the meta data format is itself binary.
  • XPath. This is something I believe can be achieved using Argot, however, it is not something I've worked on. At the moment, I don't believe it's an area that developers need solutions for when binary is required. It's a nice-to-have that could be added to Argot tools.
  • XSLT. I agree this is a great example of what can be built over the top of XML. I have a long term plan to build something like this, however, it requires a language around Argot which is not yet developed. Once again, this is not a core requirement for working with binary solutions that is required today.
  • Structureless parsing. This is an area that I've just started working on. I believe that with enough meta data you can parse anything. In the next release of Argot I've built a parser that is able to read binary data into a form of DOM tree without knowing the meaning of the data. This is an important step to building XPath, XSLT and generic binary data editor tools. This also makes it possible to connect to a server and dynamically discover its interfaces and message formats and then dynamically build requests and read responses. Solving this ability at a binary level should lead to some interesting results when all appliances in the home can publish their message formats and capabilities.
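The idea of two peers picking a data format that both support can be sketched in Python. This is not Argot's TypeMap API; it is a simplified stand-in under stated assumptions: definitions are plain strings rather than binary meta data, versions are dotted numbers, and the names client_types, server_types and agree are invented for the example.

```python
from typing import Dict, Optional

# Each peer advertises, per type name, the definitions it supports.
# Definitions are kept as strings here; in Argot they would be binary meta data.
client_types = {"customer": {"1.0": "(name)", "1.1": "(name email)"}}
server_types = {"customer": {"1.1": "(name email)", "1.2": "(name email phone)"}}


def agree(name: str, local: Dict, remote: Dict) -> Optional[str]:
    """Return the newest version whose definition is identical on both peers."""
    ours = local.get(name, {})
    theirs = remote.get(name, {})
    common = [v for v in ours if theirs.get(v) == ours[v]]
    if not common:
        return None
    # Compare versions numerically, so "1.10" sorts after "1.9".
    return max(common, key=lambda v: tuple(int(p) for p in v.split(".")))


chosen = agree("customer", client_types, server_types)
```

Once a version is agreed, each message of that type can be tightly bound to one exact definition, while the peers themselves stay loosely coupled because either side can add or retire versions over time.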

There's probably a lot more that I could add here about Argot. However, I'm trying not to make this a sales pitch. The point I'm trying to make is that XML is a great tool for many situations. However, there are plenty of situations where binary data is required. We need more people understanding what is needed to build binary systems that support many of the features already supported by XML. We also need to recognise that the situations where binary data is used are different from those where XML is used. The set of tools and use cases are different. I know that it is possible to build tightly-bound and loosely-coupled servers that communicate in binary as I've done it with Argot. Maybe someone can prove me wrong with Argot; I'd enjoy the discussion.

19 June 2008 - Argot

Here's an example of the changes that have been made in Argot in the latest version. In Argot 1.2 an unsigned 8-bit integer was defined as:
u8: meta.basic( 8, 0 );

In the latest version, the text format is now using a form of S-Expressions. The unsigned 8-bit integer now looks like:

(meta.structure meta.name:"uint8" 
   (meta.fixed_width uint8:8 
       [ (meta.fixed_width.attribute.size uint8:8)
         (meta.fixed_width.attribute.integer)
         (meta.fixed_width.attribute.unsigned)
         (meta.fixed_width.attribute.bigendian) ] ))

There are obviously some big changes here. The name has changed from u8 to uint8 to be more in line with the C language definition. The other big change is that instead of being defined using meta.basic, it is defined with meta.fixed_width. This new type includes the size and an array of attributes. The attribute type is abstract and can be extended to define any number of attribute types.
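To show how a size plus a handful of attributes is enough to drive encoding and decoding, here is a rough Python sketch. It is not Argot code; the FixedWidth class and its field names are hypothetical, and it only models the size, integer signedness and byte order attributes from the definition above.

```python
from dataclasses import dataclass


@dataclass
class FixedWidth:
    size_bits: int      # meta.fixed_width.attribute.size
    signed: bool        # False corresponds to attribute.unsigned
    big_endian: bool    # attribute.bigendian

    def decode(self, data: bytes) -> int:
        """Read one integer of this width from the start of data."""
        width = self.size_bits // 8
        order = "big" if self.big_endian else "little"
        return int.from_bytes(data[:width], order, signed=self.signed)

    def encode(self, value: int) -> bytes:
        """Write value using this type's width, byte order and signedness."""
        order = "big" if self.big_endian else "little"
        return value.to_bytes(self.size_bits // 8, order, signed=self.signed)


uint8 = FixedWidth(size_bits=8, signed=False, big_endian=True)
uint16 = FixedWidth(size_bits=16, signed=False, big_endian=True)
```

Because the behaviour comes entirely from the attributes, describing a new fixed width type is a matter of supplying different values rather than writing a new reader and writer.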

The same definition can be shown in a very quick and dirty visual editor.

There's plenty to improve with the editor, but it shows the concept.

18 June 2008 - Argot progress!

After studying the idea of a language for Argot I realised that I needed to go back and make some changes to Argot to support it. Over the last few weeks I updated the Argot dictionary compiler and built an Argot 1.3 alpha. This latest version has required some major changes to the guts of Argot. Today I just finished some end to end testing which shows that all the changes are working.

The first was to move the Argot dictionary text format to an S-Expression format. This is the first step to allowing an Argot language which uses the same S-Expression format. The Argot compiler can now over time be updated to read any type of data.

The next was to update the Meta Dictionary. The meta dictionary is the set of self describing data elements which forms the heart of Argot. This is normally a scary process as each small change ripples through other parts of the meta dictionary. While the changes were quite major they didn't change the more difficult aspects of the meta dictionary.

The changes included changing the meta.basic type to meta.fixed_width. This now has attributes which can be expanded to describe any fixed width data type. The other was to split the meta.reference type into a meta.reference and meta.tag type. Originally all references required a description. This wasn't required in many situations. Now the reference is a simple type reference and the tag provides a description. The other was to allow abstract data types to map to other abstract data types. This creates a more flexible type system.

The final change, which has been the most challenging, has been to create a Document Object Model (DOM) style interface to Argot. This provides the ability to read Argot data either directly into business objects or into an abstract DOM structure. The current version has some code duplication, but works like a charm.
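The general idea of reading binary data into a generic tree, driven only by meta data, can be sketched in Python. This is not the Argot DOM interface; the SCHEMA and PRIMITIVES tables and the read function are hypothetical, and the meta data is reduced to a list of field names and type names.

```python
import struct
from typing import Any, Dict, List, Tuple

# Hypothetical meta data: a structure is a list of (field name, type name).
SCHEMA: Dict[str, List[Tuple[str, str]]] = {
    "point": [("x", "uint8"), ("y", "uint8")],
    "line": [("start", "point"), ("end", "point")],
}

PRIMITIVES = {"uint8": "!B", "uint16": "!H"}  # struct formats, big-endian


def read(type_name: str, data: bytes, offset: int = 0) -> Tuple[Any, int]:
    """Return (node, next offset). Nodes are plain dicts or ints."""
    if type_name in PRIMITIVES:
        fmt = PRIMITIVES[type_name]
        (value,) = struct.unpack_from(fmt, data, offset)
        return value, offset + struct.calcsize(fmt)
    node = {}
    for field_name, field_type in SCHEMA[type_name]:
        node[field_name], offset = read(field_type, data, offset)
    return node, offset


tree, used = read("line", bytes([1, 2, 3, 4]))
```

No Point or Line class exists anywhere in the sketch; the tree is built purely from the schema, which is what lets generic tools such as an editor work with data they know nothing about.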

There's still some clean up to do on the code, but it should be released soon. I may even build a small example Argot editor to show off the new DOM style data.

PS If none of that made any sense whatsoever, don't worry; it will all become clearer when an Argot editor is built!

14 January 2008

After a year and a half of not providing news, I've decided that in 2008 I'll actually start again. My aim is to document the design and implementation of a new browser. I still need to work out what to call it. More soon...

26 March 2006 - Working week

It's been a week of my new job in the Wireless and Location group at Sensis. The first week is usually a bit slow and painful as you try and get your head around a new set of tools and code. So far, it has been pretty good, and I've gotten up to speed quickly. The less brain strain in the first week the better.

On my first day I found a new tool to add to my collection of favourites. Emma is a coverage tool which tells you what percentage of your code has been utilized during tests. It's a great way to ensure that you've debugged enough of your application, or written enough unit tests. In other jobs I've used commercial versions of this type of tool, however, Emma is the first one that I've found that is completely free.

The other thing I need to get used to is not having much time to work on my projects. Thankfully Argot and Colony are nearly at a stage I'm happy with. The biggest task left is to prepare Colony to be put on the public Subversion with the same license as Argot. After that I can move on to projects that use Argot and Colony, rather than continue developing them.

Speaking of other projects, after three months of hitting my head against a wall, I was able to add loadable modules to LuaPlayer. One of the smaller projects will be to add Argot and/or Colony client to LuaPlayer as a loadable module.

6 June 2006 - Too much to do..

It's already June and there's been a lot going on that I haven't been talking about here. The biggest news is that I'm now back at University, studying a postgraduate certificate in innovation management at the Melbourne Business School. What that all means is that I'm studying business at the top business school of Australia. The course has a focus on Innovation which should provide some great help for evaluating and commercialising my various software related ideas.

The course so far has been very interesting. The first subject, "World of Management" provided a nice overview of all the course. It also had a strong focus on the theory behind learning and skills required to make the most of the course. I must admit that the subject was geared towards students doing a full MBA. This meant that quite a bit of content was geared towards the skills needed to survive a three year part time masters course. Not an easy task! Overall I'm glad I did the subject. Last Saturday was the exam which was a bit of a shock to the system after 10+ years of not having to do an exam. It's nice to have the first one out of the way though.

My second subject I'm taking is "Organising for Innovation". As you can guess, this is all about how to structure business so that it can excel at creating and evaluating new and innovative concepts and then more importantly get them built. There's been some fantastic information in this course already and I'm only into the third week.

In other news I have finally taken the plunge and will be getting a car. This is the first car that I've ever owned after owning motorbikes most of my driving life. After looking around at many cars and deliberating way too much I ordered a new Honda CRV Sport. Of course it is black and comes with some fancy extras like sunroof, etc. To finish off the look of the car I also did something rather fun and silly. I purchased personalised number plates with...... ROXOR. It would have been better if they allowed r0x0r, but they don't allow lower case or the number zero. ROXOR will have to do! I pick up the car this friday which should be fun!

With all the above going on I have had little time to concentrate on other projects. Thankfully ps2dev.org has been moving along nicely the last few months. The forums have a good size community and the moderators have been doing a good job. It also looks as though there will be a reasonable number of entries to the Fourth Creation Game competition.

I can't say the same for Argot and Colony. These projects need to mature a little more and find a niche. Without that communication happening to drive development things are going to continue to move slowly.

7 March 2006

After a lot of deliberation last week, I now have a job! I'll be working in the Wireless and Location group at Sensis. Sensis are the company behind Australia's Yellow and White pages directories. They are also very serious about the internet and creating a lasting presence there. I think working in the Wireless and Location group will be a great fit for my skills and background with Argot and Colony. The job starts on the 20th of March, which gives me the next two weeks to get a few things done.

One of those things has been to fix Colony to work with the latest version of Argot. While I prepared Argot for release and created the C# and C versions, Colony became out of sync with the Argot interfaces. Bringing it back into line also added the features of strongly bound interfaces between client and server. Now every method is checked to ensure the request/response/exception parameters are the same. This tight contract between client and server has really proven its worth while upgrading ps2dev.org's web site software. Not a single byte change in an interface gets past Colony.
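The kind of check described here can be illustrated with a short Python sketch. It is not how Colony is implemented; the fingerprint and compatible functions, the type names and the use of a SHA-256 hash are all assumptions chosen to show the principle that any change to a method's request, response or exception parameters is detected.

```python
import hashlib
from typing import Dict, List


def fingerprint(method: str, request: List[str], response: List[str],
                exceptions: List[str]) -> str:
    """Hash a method's full signature so any change is detectable."""
    text = "|".join([method, ",".join(request), ",".join(response),
                     ",".join(exceptions)])
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compatible(client: Dict[str, str], server: Dict[str, str]) -> bool:
    """Every method the client uses must have an identical signature on the server."""
    return all(server.get(name) == fp for name, fp in client.items())


server_iface = {"getTopic": fingerprint("getTopic", ["uint32"], ["topic"], ["not_found"])}
same_client = {"getTopic": fingerprint("getTopic", ["uint32"], ["topic"], ["not_found"])}
stale_client = {"getTopic": fingerprint("getTopic", ["uint16"], ["topic"], ["not_found"])}
```

Here same_client passes the check while stale_client, which differs only in one parameter type, is rejected before any call is made.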

Speaking of ps2dev.org, I upgraded the site software today. The last upgrade was at the start of August last year. I'd love to spend more time on the software, however, there's always so much I'd like to do. Today's software upgrade includes News items, locking and unlocking topic items, file handling changes, the ability to delete topics and a few other things. Of course there's always plenty more to do, but it might be time to move onto other things.

27 February 2006 - Lots happening

Hi Ho Hi Ho.. it's off to work I go.. again.

After about eight months of working on Argot, Colony and generally attempting to move Einet forward, it is time to admit defeat for now. I've accomplished the main goal of the last eight months, which was to release Argot under a shared source license. As always, it didn't go quite to plan. The Open Vendor license didn't get approved by the OSI, which was a disappointment. Argot didn't get released before Christmas which was also a disappointment. However, Argot did get written in Java, C# and C, and was released. A manual was written which describes the details of Argot's dictionary. Overall, the time taken was well worth it.

In the past month a lot of time has been taken up with looking for work. However, that task should be finalised this week, as I received two offers late last week. Now it's time to make the difficult decision of which one to take. Both are contracts, and both are interesting work. In the next day the final choice will be made.

In the past month I've also been working on Colony and resyncing it with Argot. It has required a few changes to the network type agreement interfaces, but it is turning out for the best. The Argot changes have been put in Subversion for anyone to check out online.

I've also started reading Ralf's blog which I found while Blog jumping last week. He has some smart theories on RPC and how to develop client interfaces for them. Have a read of Form should follow function and Coordination structures beat RPC. They offer some well thought out arguments on RPC methodologies.

I've also finished reading The Innovator's Dilemma by Clayton M. Christensen. It focuses on disruptive technology and why large companies fail to adapt to new technologies. The author uses the hard drive industry as a case study, which is a lot more fascinating than you'd expect. Do you know how many companies who made the big 14" hard drives for the original main frame market survived to make 8" hard drives for the mini computer market? Zero.. Read the book to understand why!

Finally, I've met up with the people up at NICTA/Melbourne Uni doing P2P research. They're doing some interesting research into data storage and retrieval using Distributed Hash Tables. They've even wrapped it all up and made a gaming platform to test out their theories. Hopefully after work has started I'll still have a chance to keep in contact with what they are doing. I also found out that they will be hosting Middleware 2006 in Melbourne later this year! Maybe it's time to try putting pen to paper and turn my Argot knowledge into a paper worth publishing.

19 January 2006

After some work with Makefiles, NAnt and Ant, the Argot packages are finally released. Each package includes the Argot dictionary compiler, which for now requires Java. The files are available from the Einet web site. If you have just happened to stumble across Argot, have a look. Argot provides the ability to describe binary data so that it can be used more effectively in file formats, communications or anywhere data is written or read. Being available in C, Java and .NET (C#), it can be used in heterogeneous systems. Being binary, it is efficient and can be used in anything from small devices to internet services.

In the next few weeks there's still plenty to do. Thankfully the tedious job of getting Subversion set up and the packages released is done. Maybe I can get back to some code. In particular, it would be nice to make some changes to the client/server data type agreement protocol; maybe even come up with a short name for it.

17 January 2006

Another year has come to a close, and the festive season has wound down. The week before Christmas I had planned to package up Argot ready for release and advertise it; however, it wasn't to be. There's always way too much to do, and way too little time. The break after Christmas has been fantastic, with great presents, visits to family and holidays. Now that the break is over it's time to get back to the grindstone.

As usual, I'm looking ahead to a new year and wondering how many of the things I'd like to get done will actually get done. Most of the items on my todo list this year so far are small and, with a bit of concentration, shouldn't take too long. The major items are: to finally get Argot out the door and start telling the world about it, finish the PSP version of Send0r, and add improvements to the ps2dev web site. There's plenty to do on all those fronts, so it's time to get to them.

9 July 2006 - Cutting through the tower of babel

In my current job I've been doing development at the front end of a web application. This has meant getting a lot more involved with technologies like CSS, HTML, JavaScript, AJAX and plugins. This is on top of the requirement of developing in Java using Struts, Hibernate and database technologies. This tower of babel we have built for ourselves in computer science is one of the most bizarre things to come out of the new millennium. To make matters worse, it is not just the large number of different technologies that are difficult to learn and put together. The technologies themselves change from browser to browser; CSS, JavaScript and AJAX all have their idiosyncrasies. There are complete books devoted to helping understand how to make web sites work on multiple browsers. This is not where I thought software development would end up.

The question I ponder is: when is this pain going to end? A developer's job is supposed to be to solve commercial problems and create competitive advantage for companies. I think we spend more time just trying to keep up with the latest methods and languages and, in particular, trying to understand how they can come together and actually work.

I believe Argot holds a key to reducing this tower of babel. At the core of all programming and markup languages is a text description of what the developer is trying to achieve. Argot is designed to record complex data structures to disk in a form that is quick and easy for programmers to read and write. The result is that the structures we describe in CSS, HTML, JavaScript, Java, etc. all become part of the one structure. This has some huge implications for how we develop systems and languages.

The most important aspect of applying Argot to development is that it reduces each concept in all our languages to individual structures (words) in a dictionary. With Argot we have the ability to add or remove words whenever we like, which allows the whole system to evolve. So where's this most awesome solution, you might ask? It requires a lot more work than I have time to accomplish right now. If you're interested in throwing in a few lines of code, please let me know. Maybe together we could reduce this tower of babel to rubble.

The plan is reasonably straightforward. The most important aspect to be solved initially is creating an Argot editor. To accomplish this, Argot requires an abstract model which can read any Argot data and allow it to be manipulated. The initial editor is likely to be written in Java and use a basic tree view to get started. From there the tree view can be replaced with a text representation and editor. This provides the language with a structure that more developers understand and are happy to work with.

After an editor is built it comes back to building the language. The most important aspect of an Argot based language is the words in the dictionary. It must be created so that it is flexible and open to change. Developing the initial set of words and a basic engine to execute them is likely to be achieved incrementally. It is likely that the language will have some basis in scripting or object oriented languages such as Smalltalk. However, it must also combine this with concepts from HTML and CSS to allow easy methods of defining look and feel for user interfaces.

If this language can be created, the final part of the puzzle is to add communications. Argot was designed for communications and it is likely to be where the new language will find its biggest advantages over current systems. Using Argot, objects and data structures can be transferred between peers with ease. Communications will in effect be written into the core of the language.

I suspect this would take me a couple of years to develop alone. There are many issues that will arise along the way. However, I think the goal is a realistic and smart one. The tower of computer babel we have created needs to be replaced with something new and well thought out. Argot offers an underlying key concept in achieving this.

21 November 2005

Nearly another week, and another step closer to having Argot out in the wild. It turns out that converting the network resolution code to C was a little more painful than I would have liked. Finally, I'm pleased to say that the first cut is done. Argot is ready to be sent out the door... well... nearly!

The last step in getting Argot out the door is to find an appropriate license. A full OSI approved license is not going to provide the right mix of freedom and protection, which is a shame. Now that the OVPL is not considered open source, it's time to rethink the best way to allow open source to take advantage of Argot, yet still allow the business to grow and develop the software.

Surprisingly, there really isn't much out there that provides a precedent for this type of license. The GhostScript AFPL and Java Research License touch on some of the required aspects.

Much like Java where controlling compatibility is important, Argot needs to develop but ensure commercial applications stay compatible. The Java Research License provides these elements. But to be compatible with open source we would like to allow Argot to be embedded in any open source software. It is this element that is difficult to find in a license.

Some more research to do. It will be absolutely fantastic to finally have Argot out the door. Very nearly there...

15 November 2005 - End of OVPL?

It has taken the Open Source Initiative (OSI) eight months to come to the conclusion that the OVPL and OVLPL do not comply with the Open Source Definition (OSD), and therefore will not be approved for use by the OSI. This has been a long and frustrating battle; however, I'm glad there is now some resolution. My response to the OSI had a touch of annoyance about it. I also posed a few questions to the OSI regarding their stance on future licenses to ensure everyone understands their new 'line in the sand' (the request and response are not yet archived online). If nothing else, I can say that I helped shape the future of how the OSI interprets and enforces the OSD.

The result of this is that it is time to rethink what license is required for Argot. My initial feeling is that it is unlikely to be an OSI approved license. This gives more freedom in tailoring the license to best fit Argot's and its users' needs. The downside is that Argot doesn't get the OSI logo on it; c'est la vie.

The final nail in the coffin of the OVPL is not quite hammered in. We may look at using a BSD license back for contributions. However, it will need some discussion to decide how this will affect the license.

In Argot news, the Argot Network Resolution code has been added to the Java and C# versions of Argot. It has been added in a way that abstracts the functionality away from any transport. It requires that the developer provide the transport mechanism as part of their peer to peer or client server protocol. That finishes up the functionality for those versions.
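The idea of keeping type resolution separate from the transport can be sketched in a few lines. This is a minimal illustration in Python, not Argot's actual API: the `TypeResolver` class, the `Transport` callable and the wire format (a type name sent as UTF-8, a 2-byte id returned) are all invented for the example.

```python
from typing import Callable, Dict

# Hypothetical transport contract: request bytes in, response bytes out.
# The application supplies this, so the resolver never touches a socket.
Transport = Callable[[bytes], bytes]


class TypeResolver:
    """Maps type names to remote type ids via a caller-supplied transport."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._cache: Dict[str, int] = {}

    def resolve(self, type_name: str) -> int:
        # Ask the remote side once per type, then answer from the cache.
        if type_name not in self._cache:
            reply = self._transport(type_name.encode("utf-8"))
            self._cache[type_name] = int.from_bytes(reply, "big")
        return self._cache[type_name]


def in_memory_transport(request: bytes) -> bytes:
    """Stand-in for a real connection: looks names up in a fixed table."""
    table = {b"u8": 1, b"u16": 2}
    return table[request].to_bytes(2, "big")


resolver = TypeResolver(in_memory_transport)
print(resolver.resolve("u16"))
```

Swapping `in_memory_transport` for a function that writes to a TCP socket or a peer to peer channel would leave `TypeResolver` unchanged, which is the point of the separation.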

Only a few more things to do before release. Sort out this license issue and port the Argot Network Resolution code to C. Easy!

11 November 2005

Phew! After a day of turning my brain into mush I have the C version of Argot finished to the same level as Java and C#. The missing functionality was providing Argot Message Format readers and writers. This also happens to be one of the more complex and brain twisting elements of Argot. It's always nice to press the commit button and know that it is done.

The other thing I've added to the C version is CppUnit testing. Argot now has unit testing libraries for each version: JUnit for Java, NUnit for C# and CppUnit for C/C++. These testing frameworks are a real life saver when it comes to making sure functionality is the same for each version. I hope my XP friends are happy that their methods are rubbing off.

The documentation is done. The functionality for C, Java and C# is done. Only a couple of things are still required before release. The first is approval from the OSI for the Open Vendor License. The second is adding the Argot Network Resolution (ANR) protocol to the Argot library.

The Open Vendor License is still up before the OSI board. They will be meeting next week to finally decide its fate. It still isn't clear which way they will go. If they decide in the negative it's going to be fun to work out the best way to release Argot. It might be back to the license drawing board.

The Argot Network Resolution protocol provides the data type matching facilities when Argot is used in client/server or peer to peer communications. Currently it forms the low level aspect of Colony. However, its functionality is probably more closely aligned with Argot than Colony. Thankfully there isn't too much code to it, as Argot already does most of the work. It might even be ready by the time the OSI make their decision.

2 November 2005

In the last two weeks I've been concentrating on slowly bringing together a programmer's guide for Argot. It's now 34 pages and should provide a reasonable introduction to getting started with Argot. I'm sure it will grow over time.

I've also brought the Java version of Argot in line with the C and .NET versions. I use Java for testing out new changes, so it had ended up with a few loose ends. It was nice to go back and tidy it up.

The next thing on the agenda is to find out what's happening with the OVPL and OVLPL. These licenses will have been in front of the OSI board for five months on Friday. The most frustrating thing is that the OSI board has given no clear indication of whether it will approve or reject the licenses. Rejecting them will have to spur changes to the Open Source Definition, as they clearly comply.

17 October 2005 - Fun with C sharp

Another week and another language. I had hoped Argot wouldn't take too long to port to C#. Well, with a little help from Microsoft's Java to C# conversion utility, it's all done in three working days. Actually, it took three days of massaging the output of the conversion utility. It didn't do a bad job; however, I wouldn't want to use it for any larger project.

I was expecting it would take a little longer to fix the code. Thankfully C# is close enough to Java that it didn't take long to fix up the changes. A few things surprised me about C#. The biggest one was the lack of a byte[] compare method. It's always the small things that bite when changing languages. Searches on Google showed it was a rather common query. It didn't take long to handle that one. Most of my time was spent changing signed data types to unsigned ones. I never did like Java's ban on unsigned types. The other painful task was capitalising the first letter of every method name. It's something the Java conversion assistant probably should do! I was also a bit disappointed by the result of GetHashCode() on basic integer types. The result is the same as the value. It isn't hashed at all. This didn't take long to work around either. Overall, I think the C# version will not look too much like converted Java to the C# aficionados.
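The two gaps mentioned above are small enough to sketch. The post does not say which workarounds were actually used, so the following is only an illustration, written in Python rather than C# for brevity: `bytes_equal` is the kind of element-by-element loop a byte array comparison needs, and `mix32` is a Knuth-style multiplicative hash, one common way to scatter integer keys when the built-in hash is just the value itself. Both function names are hypothetical.

```python
def bytes_equal(a, b):
    """Element-by-element comparison of two byte sequences."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def mix32(value):
    """Multiplicative hash: multiply by a large odd constant, keep 32 bits.

    2654435761 is the constant Knuth suggests (close to 2**32 divided by
    the golden ratio); neighbouring inputs land far apart in the output.
    """
    return (value * 2654435761) & 0xFFFFFFFF


print(bytes_equal([1, 2, 3], [1, 2, 3]))
print(mix32(1), mix32(2))
```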

I now have Argot in Java, C and C#! The only bit missing is documentation. This next task won't be my favourite, but documentation can make or break a project. So time to document!

12 October 2005

I've managed to shrink my RPC server to 7kb! My last test case used a Java client to Java server using the full Argot/Colony software library. This test case uses the exact same Java client connected to a hand coded C server over TCP/IP. The server uses a static type map which defines the full set of data types, interfaces and objects on the server.

The original aim of this exercise was to target an ATMEL ATMEGA8 microprocessor with 8kb of memory. Compiling the same C code for the ATMEL AVR produced a 3kb binary! This provides a single function on a single interface. I suspect that with a normal set of functionality this would creep up to 5kb. However, even this still leaves plenty of room for actual functionality to be included in the processor.

This now shows that the Argot/Colony combination can create a strongly bound data contract between client and server on even the smallest of devices. This includes both data and interfaces. This is the exact same protocol and software used in full size service oriented architecture (SOA) solutions using Argot/Colony.

All the C skeleton code was hand coded in this test. In time I will create an IDL to Dictionary converter and then a Dictionary to C skeleton code generator which will speed development time. The only part not hand coded was the static type map; this was generated with the interface dictionaries.

It looks like this exercise of seeing how small Argot can go is a great success. I will have to wait and see if I can get some interest in actually creating light switch and light hardware to show Argot/Colony being used to turn on a light!

For now it's time to jump languages and move back to C#. Having done Argot in Java and C, I figure it would be nice to also include a C# version. This will provide good coverage of the most popular languages for Argot. The C# version was started last year. Hopefully it won't take too long to complete.

In other news, the Open Vendor Public License continues to make slow progress. If you ever have the thought that it would be really useful to have another open source license in the world, be prepared! The OVPL has taken nearly ten months so far. We shall see how much longer it will take!

7 October 2005

It has been about three weeks since I started investigating Objects and Interfaces. I've just managed to call a remote method. object.doSomething(22); returns the value 66. Remote procedure call (RPC) techniques like this remind me of a conversation I had last week with a friend. We were discussing the fact that I'm currently putting a lot of effort into developing a light switch which, if I'm lucky, could possibly turn on a light. Fifteen years of computer science training and experience culminating in the ability to flick a switch and turn on a light! Thankfully, the underlying techniques have a much broader application than a single switch.

This is the second iteration of the underlying RPC mechanism. The previous version relied on matching data types dynamically with methods on the server. This version creates an IDL style description of the methods, which are then bound to the implementation interfaces. This allows the client and server to check the signature before making a call. If the client's signature differs from the server's, the client can fail safely. I'm continually impressed with how the strong data contract between client and server allows me to quickly debug the communication aspects of the system.
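The signature check can be illustrated with a toy model. This is a sketch in Python under stated assumptions, not the Argot/Colony implementation: the `Server`, `Client` and `SignatureMismatch` names, the tuple used as a signature and the `s32`/`u8` type names are all hypothetical, and multiplying by three is an invented stand-in that happens to reproduce the 22 to 66 result mentioned above.

```python
from typing import Callable, Dict, Tuple

# Hypothetical IDL-style description: (method name, argument types, return type).
Signature = Tuple[str, Tuple[str, ...], str]


class SignatureMismatch(Exception):
    """Raised when the client's view of a method differs from the server's."""


class Server:
    def __init__(self) -> None:
        self._methods: Dict[str, Tuple[Signature, Callable[..., int]]] = {}

    def bind(self, signature: Signature, implementation: Callable[..., int]) -> None:
        # Bind a described method to the code that implements it.
        self._methods[signature[0]] = (signature, implementation)

    def describe(self, name: str) -> Signature:
        return self._methods[name][0]

    def call(self, name: str, *args: int) -> int:
        return self._methods[name][1](*args)


class Client:
    def __init__(self, server: Server, expected: Signature) -> None:
        # Compare signatures before any call is made, so a mismatch
        # fails here rather than corrupting data on the wire.
        if server.describe(expected[0]) != expected:
            raise SignatureMismatch(expected[0])
        self._server = server
        self._name = expected[0]

    def invoke(self, *args: int) -> int:
        return self._server.call(self._name, *args)


DO_SOMETHING: Signature = ("doSomething", ("s32",), "s32")
server = Server()
server.bind(DO_SOMETHING, lambda value: value * 3)
client = Client(server, DO_SOMETHING)
print(client.invoke(22))
```

A client built with a different argument type, for example `("doSomething", ("u8",), "s32")`, raises `SignatureMismatch` at construction time instead of sending a badly encoded request.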

Now the Java implementation is working I can move back to C and see how small I can build a service.

20 September 2005

I've been meaning to put up a photo of the office's latest painting for a couple of weeks now. It's called "4bit Argot" and is my version of a self describing painting. It has turned out to be a nice way to give a visual view of what Argot is all about. The painting is a binary view of all the information required to describe the information on the painting. Still confused? Maybe the meta description will help.

Each of the following definitions has been encoded onto the painting using only the descriptions included:

  • definition( 0 "u4" , basic( 4, 0 ) );
  • definition( 1 "basic" , sequence([ reference( #u4 ), reference( #u4 )]));
  • definition( 2 "abstract", sequence([]));
  • definition( 3 "expression", abstract() );
  • definition( 4 "sequence", array( reference( #u4 ), reference( #expression ) ) );
  • definition( 5 "reference", sequence( [ reference( #u4 ) ] ) );
  • definition( 6 "array", sequence( [ reference( #expression ), reference( #expression ) ] ) );
  • definition( 7 "expr#ref", map( #expression, #reference ) );
  • definition( 8 "expr#seq", map( #expression, #sequence ) );
  • definition( 9 "expr#ary", map( #expression, #array ) );
  • definition( 10 "map", sequence([ reference( #u4 ), reference( #u4 )]));
  • definition( 11 "expr#map", map( #expression, #map ) );
  • definition( 12 "expr#abstract", map( #expression, #abstract ) );
  • definition( 13 "definition", sequence([reference( #u4 ), reference( #expression )]));
  • definition( 14 "expr#basic", map( #expression, #basic ) );
  • definition( 15 "dictionary", array( reference( #u4 ), reference( #definition )));

The painting contains a dictionary definition. White is 1, black is 0. Can you decode the painting using the definitions above? There could be a bit or two wrong; let me know if you can find any errors. :)
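To get a feel for the encoding, here is a minimal Python sketch of how a single u4 value could map to four painted cells and back. It assumes that u4 is a 4-bit unsigned value and that the most significant bit is painted first; the actual bit order and layout of the painting are not stated, and the function names are made up for the example.

```python
def u4_to_cells(value):
    """Return four bits, most significant first (assumed order), for a u4."""
    if not 0 <= value <= 15:
        raise ValueError("u4 values must be in the range 0..15")
    return [(value >> shift) & 1 for shift in (3, 2, 1, 0)]


def cells_to_u4(cells):
    """Inverse of u4_to_cells: fold four bits back into a number."""
    value = 0
    for bit in cells:
        value = (value << 1) | bit
    return value


def render(values):
    """Draw one row per u4 value: W for white (1), B for black (0)."""
    rows = []
    for value in values:
        rows.append("".join("W" if bit else "B" for bit in u4_to_cells(value)))
    return "\n".join(rows)


# The identifiers 4 ("sequence") and 0 ("u4") from the list above.
print(render([4, 0]))
```

Decoding the painting would run the other way: read four cells at a time with `cells_to_u4`, then interpret each number according to the definitions in the list.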

The painting has some similar attributes and problems to what I've been working on the last week. The challenge has been to see how small Argot can go. The aim is to embed Argot in devices like light switches. As I mentioned last week, the target is 8kb AVR microprocessors. I was a little surprised at how easy it was. The data marshalling code finished up at 1kb and the Argot description at around 3kb. This was much smaller than I expected and leaves 4kb in an 8kb processor for functionality.

The aim of the light switch exercise was to allow a new device to be added to a home network and have the device contain a description of all of its functionality. The home network controller is then able to configure itself appropriately to communicate with that device. Doing the exercise reminded me that I wanted/needed to put Interface and Object descriptions into Argot.

The focus this week is now Interfaces and Objects. After that is done I can have another go at creating a working prototype for the light switch.

14 September 2005

After two weeks of intense development the first cut of Argot for C is done. I've ended up with a head cold and feel terrible, but at least Argot is basically done. The first cut is complete when Argot can read in a dictionary file of definitions of other elements. This is not only an essential feature, it also tests about 90% of the Argot software in one hit. This is all thanks to the self referencing nature of Argot and Dictionary files. My original aim for Argot was for the library to be less than 50Kb. I'm incredibly chuffed that my final test executable ended up at 32Kb!

One of the interesting things about writing Argot in C is seeing all the improvements I can make to the Java version. It's difficult not to jump straight back into the Java version and make it better; however, for now there are better things to do. I also promised myself I would set up a build machine that will ensure all the different language versions of Argot continue to work together.

There are two other versions of Argot I'd like to have complete: the C# and Micro editions. Working with the C version I've had some great ideas for the Micro edition. Today I'll be loaning a development kit for some 8kb and 16kb processors. It's going to be a tight squeeze making them Argot enabled, but I think it's doable!

The C# version is going to involve re-porting the Java version with the assistance of the Microsoft Java to C# porting tool. With that done I'll have my four core versions complete: Java, C, C# and Micro.

Aside from the development, I've been talking to people about Argot and getting feedback. Having to explain the details of the meta dictionary to other people has given me a better understanding of it. It has also helped me see some small things to tweak and improve in the meta dictionary.

So much to do! Argot Micro, Argot C#, Build Machine, Meta Dictionary improvements across all versions. That should keep me busy for the next month.

30 August 2005

Now that I have a blog, I suppose I should keep it up to date. After finishing the web sites last week, I put two papers online on the Einet web site. The Argot White Paper provides some high level information on Argot as well as some of its low level technical aspects. The other new paper is "Creating Evolvable Programming Languages", which demonstrates how Argot can be used to encode languages. This is opposed to the traditional method of encoding languages using text and syntax. Both are available in the Articles section of Einet.

With the web sites up to date, I've moved back to working on the C/C++ version of Argot. The C++ version of Argot was written early last year. However I discovered that C++ exceptions, RTTI, STL templates and virtual functions caused a lot of bloat. This time around I've decided to turn as much as I can into C.

After working with Java for a while, it's actually a refreshing change to write code in C. I've got all the fun of playing with function pointers, structures, unions and memory management. This refactoring will probably lead to changes that can be put back into the Java version. It's very interesting to see the same program in the light of a different programming language.

One of the disadvantages of working in C over Java is that you need to build your own abstract data structures, like stacks and queues. This turns out to be an advantage because, unlike in Java, these can be tailored specifically for the application. A simple case of this was developing a small integer to integer hash map (intmap). I was able to create a fast integer hash map in 1.5KB. This intmap will be heavily used in Argot, so making it small and fast will pay off well.
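For readers who have not built one, an integer to integer hash map can be very small. The sketch below is in Python for brevity and is not the Argot intmap: it assumes a fixed capacity and open addressing with linear probing, which is one common way to keep such a structure compact in C, and the `IntMap` name and its methods are made up for the example.

```python
class IntMap:
    """Fixed-capacity int -> int map using open addressing with linear probing."""

    def __init__(self, capacity=16):
        self._keys = [None] * capacity   # None marks an empty slot
        self._values = [0] * capacity
        self._capacity = capacity
        self._size = 0

    def _slot(self, key):
        # Start at key mod capacity and walk forward until the key or an
        # empty slot is found. One slot is always kept empty (see put),
        # so this loop is guaranteed to stop.
        index = key % self._capacity
        while self._keys[index] is not None and self._keys[index] != key:
            index = (index + 1) % self._capacity
        return index

    def put(self, key, value):
        index = self._slot(key)
        if self._keys[index] is None:
            if self._size == self._capacity - 1:
                raise OverflowError("intmap is full")
            self._keys[index] = key
            self._size += 1
        self._values[index] = value

    def get(self, key, default=-1):
        index = self._slot(key)
        return self._values[index] if self._keys[index] == key else default


types = IntMap(8)
types.put(1, 10)
types.put(9, 90)  # 9 % 8 == 1, so this collides with key 1 and probes onward
print(types.get(1), types.get(9), types.get(17))
```

Because the capacity is fixed and there is no per-entry allocation, the same layout translates directly to two plain arrays in C.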

In other news the Open Vendor Public License (OVPL) continues to be discussed on the OSI license-discuss mailing list. There's plenty more that could be said on this; maybe a topic for the next post!

21 August 2005

Finally, the new web site is now online. This completes the process of getting my various projects up to the same level of software. www.livemedia.com.au, ps2dev.org, www.einet.com.au and openvendor.org are all using lime site (name subject to change) software. From now on it should be easier to manage upgrades as I add features and fix bugs.

This web site is also the first to be fully designed with CSS. Over time I'll redo the other sites using CSS. I'm especially looking forward to updating ps2dev.org with CSS. It's in desperate need of a facelift. Maybe with CSS enabled, I can encourage one of the members with some graphic abilities to have a go (hint hint).

There is still plenty to add to the software. Over the next few entries I'll hopefully explore some of the items that need to be completed.

16 August 2005

Here's my first entry. The site isn't even live yet; however, after nearly two years of having a very basic web site online, it is time to refresh the site. Live Media will continue to be my home base as I continue work on my various projects. Hopefully this blog will let other people know what changes I'm making and planning for the various sites and software I manage through Live Media.

Copyright 2005 © Live Media Pty Ltd