Microservices and the First Law of Distributed Objects (martinfowler.com)
149 points by resca79 on Aug 13, 2014 | 69 comments


The real strength of microservices is physical isolation more so than logical isolation. That is, there is an 80/20 rule to scaling.

For instance, there is probably one function of your system that is responsible for a huge amount of work. If everything is in one database (say SQL, Mongo, ...), you have a complex system that is hard to scale. If you split the heavy load out, you might find the scaling problems vanish (because the high load no longer has the burden of excess data), and even if there is still a problem, it is much easier to optimize and scale a system that does just one thing.

The most disturbing thing about microservice enthusiasts is that they immediately jump to: oh, we can write these services and clients in ColdFusion, Ruby, COBOL, Scala, Clojure, PHP, and even when we write them, the great thing is "WE DON'T HAVE TO SHARE ANY INFRASTRUCTURE!"

That's bogus to the Nth degree, because a lot of the BS involved with distributed systems has to do with boring things like serialization, logging, service management, etc.

I think you still want to use the same language, same serialization libraries, management practices, etc. across all of these services otherwise you are going to get eaten alive dealing with boring but essential problems.


The real strength of microservices is that they should force you to specify your services by API and protocol. This is what allows them to scale to larger groups of developers at the same time as scaling to large customer bases.

Both Netflix and Twitter, for example, have fallen into the trap you describe of using a common set of infrastructural libraries. With Twitter this has manifested in their microservices actually being a tightly coupled distributed application, not a loosely coupled set of services. Netflix, on the other hand, has the set of libraries, yet has many groups not using them, and because they went with libraries instead of APIs/protocols, the non-library-using applications don't follow the standards.

You need fully specified APIs and protocols with at least two, if not three, different implementation platforms to keep you out of the trap of the common infrastructure.

[edit: fixed capitalization]


The dream of reuse has made a mess of many systems. The value of components (you can call them microservices if you like) lies in independent entities communicating over well-defined interfaces. This means that you can split work up into manageable units (and distribute the work across teams if you want to). But if things are genuinely independent then we have to accept that there will be some redundancy. As long as you stick to communicating with another component through its top-level protocol/interface then things are OK. The temptation is to notice that there is something else the other component does internally that you would like to re-use in your implementation. So you ask them to expose that. This is where it starts to go wrong. You have increased the complexity of interaction with the other component just to get some re-use. Do this a few more times and the system becomes unmanageably complex. It's like any organisation. If you want to succeed you need to be able to delegate fully and not micromanage people for the sake of some minor optimisation.

So the thing is to be disciplined. Only communicate over the formal interface and mind your own business/implementation. That means don't get involved in other people's business, and when other people try to get involved in yours, tell them to $@&*!... please move along. Michael Feathers has a good post [1] about how in COM it was hard to create components, and so this enforced this kind of discipline. I think that another thing about COM was that you never got the source code for a component; it was always binary and the only thing that was published was the interface. This prevented people from nosing around in other people's implementations. Feathers' point is that microservices are similar: it's hard to create a microservice, so the granularity does not end up getting too fine. But to my mind there is no need to pay the price (failures, administration) of having to distribute a service just to keep granularity coarse. Sure, if you want to distribute in order to scale, that's fine. But otherwise just have your components in process and be disciplined about your interfaces. (Besides, microservices aren't that hard to make once you have made a few of them, so you are going to have to have discipline there too.)

[1]https://michaelfeathers.silvrback.com/microservices-until-ma...


There's a way to eliminate the reuse problem as well. If you make the services event-based and only let them speak to each other through messages then a service can never use another service. Services only listen and dispatch, no interservice request/reply calls.
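The listen-and-dispatch style can be sketched with a tiny in-process event bus; this is an illustrative toy (the `EventBus` class, topic names, and services are all hypothetical), not any particular messaging product:

```python
from collections import defaultdict


class EventBus:
    """Minimal sketch of the pattern: services never call each other
    directly; they only publish events and subscribe to topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers[topic]:
            handler(payload)


# Two "services" that only listen and dispatch -- neither knows the other exists.
bus = EventBus()
shipped = []

def billing_service(order):
    # reacts to an order event, then emits its own event
    bus.publish("order.billed", {"order_id": order["order_id"]})

def shipping_service(billed):
    shipped.append(billed["order_id"])

bus.subscribe("order.placed", billing_service)
bus.subscribe("order.billed", shipping_service)

bus.publish("order.placed", {"order_id": 42})
print(shipped)  # [42]
```

In a real deployment the bus would be a broker on the network, but the coupling property is the same: no service ever holds a reference to, or makes a request of, another service.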


Docker is a promising building block for solving logging and service management and that's something I've been playing with for a while. You stream your logs to your container's stdout/stderr and can pick them up from the Docker REST API. You can also hit the Docker REST APIs across your hosts to discover the IP addresses and ports of the containers running various images. None of this is "solved" yet but it's getting there.
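The discovery side of that can be sketched as parsing the response shape of Docker's `GET /containers/json` endpoint. The response body below is a hand-written sample in that shape (field names can vary across remote API versions, and `myorg/...` image names and the host IP are made up), so treat this as an illustration rather than a client library:

```python
import json

# Sample body in the shape returned by Docker's `GET /containers/json`.
sample = json.loads("""[
  {"Id": "abc123", "Image": "myorg/users-api",
   "Ports": [{"IP": "0.0.0.0", "PrivatePort": 8080, "PublicPort": 32768, "Type": "tcp"}]},
  {"Id": "def456", "Image": "myorg/billing-api",
   "Ports": [{"IP": "0.0.0.0", "PrivatePort": 8080, "PublicPort": 32769, "Type": "tcp"}]}
]""")

def discover(containers, host):
    """Map image name -> (host, published port) for naive service discovery."""
    services = {}
    for c in containers:
        for p in c.get("Ports", []):
            if "PublicPort" in p:  # only containers with a published port
                services[c["Image"]] = (host, p["PublicPort"])
    return services

print(discover(sample, "10.0.0.5"))
# {'myorg/users-api': ('10.0.0.5', 32768), 'myorg/billing-api': ('10.0.0.5', 32769)}
```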

Serialization... depends, but you can get pretty far with JSON over HTTP.

So I guess you do need your microservices to be uniform in some respects, but the consistency of "every microservice is a Docker container exposing a REST API that speaks JSON and streams logs to stdout" would seem to be enough.

It's easy to build that in several languages... if Rails makes sense, use Rails. If Sinatra makes sense, use Sinatra. If Java makes sense, use Java.

You could run into trouble if you start using languages that only one or two of your staff members know, but it does allow for some flexibility (i.e. just because certain parts of the application are enterprisey doesn't mean everything has to be.)


Docker is currently switching to their own microservices library called libchan - https://github.com/docker/libchan

It is msgpack over SPDY+TLS. Every language that will want to talk to docker will need a libchan implementation eventually.

disclaimer: I'm the co-founder of the project to build the node.js version: https://github.com/graftjs/jschan

My actual goal is to pull this into the browser though, and figure out what we can do with that : https://github.com/GraftJS/graft

Until that is ready though, we have already built out our own microservices 'toolset': [1] does the Docker-based deployment, [2] is a small library that handles the REST API, and [3] handles the logging to logstash/kibana.

1. http://longshoreman.io

2. https://github.com/wayfin/waif

3. https://github.com/wayfin/dockstash


Hell yeah, latency matters, and msgpack + SPDY beats the heck out of JSON/HTTP.


There are a few issues with this approach. First, many security regulations require deterministic capture of audit-oriented log events (e.g. login failure/success, invalid access attempts, etc.). Typically, you are required to provide a local spool to handle circumstances where a remote logging service is unavailable/overloaded. Simply logging to stdout is insufficient, and hoping it gets picked up by a process over HTTP is woefully inadequate in these circumstances. Second, as mentioned previously, HTTP headers will be larger than most logging events. Therefore, this approach likely doubles/triples (admittedly back-of-the-napkin estimates) the data being transferred. rsyslog and its ilk learned this lesson long ago -- using a lightweight TCP protocol to minimize transfer overhead. Personally, I favor a locally spooling rsyslog instance in each container configured to push to a central logging service.

Besides the issues of implementation, log data needs to be analyzed. Standardizing on a logging library and associated configuration provides important consistency to use a common set of log analytics in alerting systems not to mention keeping system admins performing forensics operations sane. Finally, in addition to logging, there is service instrumentation which requires consistency to yield useful visualization and alerting for operations teams.

While Docker may permit a vast polyglot infrastructure, carefully choosing the stacks deployed will have a tremendous impact on operational robustness. Greater consistency across containers reduces the effort required to deploy and operate new services.


What you're attacking is not how Docker logging works.

The Docker daemon collects stdout/stderr from the beginning of each container's life, always, and stores it locally. You can then ask the Docker API for all the log events from a particular container ID at any time.

You would have one system that polls the Docker APIs on all your hosts and dumps the logs into the management system of your choice at some interval.

Log events are not going to HTTP live, you are not making an HTTP request per entry, and the event will be captured whether or not the network is available at the moment.


I definitely like the direction that Docker is going.

JSON over HTTP sucks. I mean, it's OK if user experience doesn't matter, but if user experience does matter, then latency matters, and if latency matters you find pretty quickly that serialization/deserialization overhead is a big deal -- particularly in a microservices environment where an external request is going to mean a large number of microservice requests happen, which means data could be serialized/deserialized many times. If you're happy being in the middle of the pack, go on with JSON, but if you are playing to win you need something faster (preferably with a framework that makes development as easy as or easier than with JSON).

With multiple languages you don't just have the problem of training but you have other complexity-multiplying problems.

For instance, one of the reasons why languages like Haskell are always a bridesmaid and never a bride is that they lack a complete "standard library". You need a urlencode() and decode function if you write web apps. If you write your own you may be more concerned with making it tail recursive than making it correct. I remember how the urldecode in Perl was busted for a decade and never got fixed, and that's a mainstream language. As flawed as PHP is, it took off because it had "good enough" implementations of everything web developers need in the stdlib.

In a microservices system there tend to be functions similar to urlencode that need to be replicated across the system, and if these are implemented over and over again in different languages it is inevitable that some or all implementations will be buggy. If you can centralize these things in one library everybody depends upon, however, life is easy and happy.
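Python's standard library is one example of having such a function centralized rather than re-implemented; the corner cases below (reserved characters and non-ASCII round-tripping) are exactly the ones home-grown versions tend to get wrong:

```python
from urllib.parse import quote, unquote

# Reserved characters must be percent-encoded, not dropped or mangled.
assert quote("a b&c") == "a%20b%26c"
assert unquote("a%20b%26c") == "a b&c"

# Non-ASCII input should round-trip through UTF-8 percent-encoding.
assert unquote(quote("café")) == "café"
```

With one shared, well-tested implementation like this, every service in the system agrees on the encoding; with five hand-rolled ones, at least one will disagree.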


> For instance, one of the reasons why languages like Haskell are always a bridesmaid and never a bride is that they lack a complete "standard library". You need a urlencode() and decode function if you write web apps.

Okay:

  cabal install urlencoded
> If you write your own you may be more concerned with making it tail recursive than making it correct

You know, if there is one criticism I've never heard -- or expected to hear -- of Haskell programmers in general, it's that they are overly concerned with optimizations over correctness.


You don't need to use the same language to use the same serialization library, however. If the latency of HTTP/JSON is truly too high, there's Protocol Buffers, Thrift, and a host of others in that space.

HTTP/JSON is nice, however, for the wealth of tooling that already exists and the relative ease of standardizing API design with REST.


+1 for Thrift.

If you like the REST architecture it works just fine with the message body being Thrift or whatever. I think of the ElasticSearch API which lets you use Thrift or XML or JSON.

The interesting distinction between binary serialization formats is if the schema is separated from the data.

For instance, if both sides of the system know you are encoding a 24 bit color value you can send 3 bytes and that is it; coding and decoding can be very quick and even possibly done on a "zero copy" basis.

If you are using something like JSON you not only have the waste involved with converting "255" to an #FF byte, but you also have to embed the schema in the sense of "here is an array of three integers (which happen to be bytes)" or "here is key 'red' and value R,..."

Thus, JSON is not "schema-free", it is "schema embedded in the data" and this inevitably bulks up the data and slows down encoding and decoding. Yes, general-purpose compression eats some of the storage/network encoding overhead, but you'll get it even tighter if you eliminate that fat before you put it through the compressor.

Now, separating the schema from the data means you need to make both sides aware of it, which is why you need some strategy for handling this in a systematic way rather than hoping things will work out OK without a plan.
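The 24-bit color case above can be made concrete with `struct` versus JSON; this is a sketch of the size difference, not a claim about any particular serialization framework:

```python
import json
import struct

red, green, blue = 255, 128, 0

# Schema known to both sides: "three unsigned bytes, in RGB order".
wire = struct.pack("BBB", red, green, blue)
assert wire == b"\xff\x80\x00" and len(wire) == 3

# Schema embedded in the data: key names and decimal digits travel too.
embedded = json.dumps({"red": red, "green": green, "blue": blue}).encode()
print(len(embedded))  # dozens of bytes for the same three values

# Decoding the schema-separated form is a fixed-size unpack, no parsing.
r, g, b = struct.unpack("BBB", wire)
assert (r, g, b) == (red, green, blue)
```

Thrift, Protocol Buffers, and the like generalize this: both sides share a schema out of band, so the bytes on the wire carry only the values.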


> With multiple languages you don't just have the problem of training but you have other complexity-multiplying problems.

> For instance, one of the reasons why languages like Haskell are always a bridesmaid and never a bride is that they lack a complete "standard library". You need a urlencode() and decode function if you write web apps. If you write your own you may be more concerned with making it tail recursive than making it correct. I remember how the urldecode in Perl was busted for a decade and never got fixed, and that's a mainstream language. As flawed as PHP is, it took off because it had "good enough" implementations of everything web developers need in the std lib.

I don't understand what you're saying here - are you arguing that we shouldn't use languages that don't have web functionality built into the std lib? What about web frameworks, e.g. Java + Spring?


Haskell's libraries and packages are actually very well fleshed out at this point. That's not the reason for poor adoption. It really is just that it's academic and mathy and difficult to understand.


In Java you can find implementations of that stuff and will probably use it. If there is anything wrong it is that there are too many different implementations out there.

In the case of Haskell you ~might~ find implementation of that stuff but you'll probably think you're too smart to have to reuse somebody else's code and also be too smart to have to deal with the corner cases.


> In the case of Haskell you ~might~ find implementation of that stuff but you'll probably think you're too smart to have to reuse somebody else's code and also be too smart to have to deal with the corner cases.

What is there besides extreme language bigotry to suggest that Haskell programmers tend to think they are too smart to do either of these things? Haskell programmers use other people's code all the time. And Haskell programmers handle corner cases as much as any other programmers.


Your criticism of Haskell here just doesn't have the ring of truth to me. Haskell has lots of good, modern frameworks and libraries for doing web-related things, which aren't difficult to use or any more corner-case-y than their counterparts in other languages.

I get the vibe that there is some underlying point you're attempting to make and that the Haskell-specific stuff is secondary. Maybe your point is that using a technology stack with unproven maturity in a given domain (in this case web application development) is riskier and likely more time-consuming than using the more common stacks for that domain. If so, then I agree. (But I'm also very appreciative of the early-movers who put in the time and effort to make immature ecosystems around nice technologies more mature; somebody has to do it!)


This is a silly comment, being that Haskell is very library and reuse oriented.


I'm curious--what sort of workload are you looking at where JSON/HTTP is insufficient?

We do a lot of realtime signal streaming, and haven't had many problems. We'll be switching to a different format (or embedded format) to better handle some numerical pickiness on our end, but the fundamental transport is fine.


I think 'streaming' is the operative difference there. If you're opening a connection and sending large amounts of JSON one-way, then header/encoding overhead is basically not a problem. But a two-way RPC protocol is basically making a separate HTTP request each time. For log messages you're going to be sending more headers and object definitions than data.
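A rough sense of the envelope-versus-payload ratio (the header values below are illustrative; real header sets are often larger once cookies, auth tokens, and tracing IDs are added):

```python
import json

# A small structured log event...
event = json.dumps({"level": "info", "msg": "login ok", "user": 1234}).encode()

# ...versus the headers a fresh HTTP/1.1 request might carry.
headers = (
    b"POST /logs HTTP/1.1\r\n"
    b"Host: logging.internal.example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 51\r\n"
    b"Accept: */*\r\n"
    b"User-Agent: service-a/1.0\r\n"
    b"Connection: keep-alive\r\n\r\n"
)

print(len(event), len(headers))  # the envelope outweighs the payload
```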

I'm curious how you're formatting your JSON too - some formats are going to be way lighter than others.


Ah, good point on the RPC stuff!

That suggests pretty much mandatory use of keep-alive headers in infrastructure services, no?


Why not SNMP? It seems to be the protocol everyone loves to hate but there's a ton of ecosystem around it and it directly solves the status/discovery case


I think the advantage of the "We don't have to share any infrastructure" mindset is that standardization is a process, not a destination. A living system will always have some diversity of libraries, versions, languages, etc. While it's important to keep it under control, a system that is robust to being half written in PHP and half in $goodlanguage is much more consistent with the messy reality than a plan of "Step 1, port everything to Ruby, Step 2, write a single set of SOA tools, Step 3, profit"


Of course, if you can slip new services under your existing PHP, ColdFusion or whatever backend, that makes a lot of sense.

If you commit to supporting a finite number of languages, that's one thing.

On the other hand, if you are going to introduce a new language every month you're on the road to hell.


I've always found it best to allow the developers the freedom to use whatever they want instead of an official 'standard', and allow them to self-organize into a few de-facto standards. You naturally standardize around better tech and attract better developers while not restricting future innovation.

The one problem with this system is that it can be fairly easy to torpedo when you get new management who sees the lack of rules as a lack of organization. They will then enforce some arbitrary standard and waste a lot of resources figuring out their mistake.


> I think you still want to use the same language, same serialization libraries, management practices, etc. across all of these services otherwise you are going to get eaten alive dealing with boring but essential problems.

Standard formats and protocols are probably more important than same languages and libraries for that particular issue; but, at the same time, not being tied to a particular language/library doesn't mean you should add them willy-nilly. It means that when a particular component has a compelling reason to use a different language/library, you aren't constrained not to do that (or, if you decide for a good reason to begin a major transition from one language to another, or one library to another, you can do it incrementally rather than in a big-bang transition or behind an additional adapter layer.)


The key is "compelling reason".

Adding multiple languages is, by itself, a bad thing. If there is a good reason to add another language (like a lot of stuff already written) then the good outweighs the bad.

I hear so much crazy stuff about microservices like: we can have different teams making all the microservices and they don't need to share any people or talk to each other at all, etc.

Practically, microservices contribute to real-world agility not when they encourage silo-building, but when you can move developers from task to task as necessary. Perhaps microservice A has reached stability and doesn't need to change much. The more standardization you have, the more you can move somebody who worked on A to work on microservice C, and then have somebody on B make a quick fix to A when the first guy is busy on C.


> I hear so much crazy stuff about microservices like: we can have different teams making all the microservices and they don't need to share any people or talk to each other at all, etc.

Minimizing coupling between components -- teams -- in the development process (as well as in the deployed systems) is a major benefit of SOA/microservices.

> Practically microservices contribute to real world agility not when they encourage silo-building, but when you can move developers from task to task as necessary.

Sure, being able to share staff in series is a benefit. Not needing to share resources in parallel is also a benefit. Silo building is harmful, because sharing best practices and tools that are a good fit for different teams' tasks is a good thing.

Coupling that limits the flexibility of one team to meet its requirements because of conflicts with another team's needs is bad. There's a big difference between enabling sharing (which is mostly a social/process thing) versus requiring sharing (because of poor architecture choices.)


That silo-building is sort of the logical extreme.

The happy medium is more like "because our services are loosely coupled, we're free to make code changes to our service and not worry about inadvertently breaking any other team's service as long as we keep our API stable." Which is a huge win over a monolithic system where everything is tightly coupled.


I think the real strength of microservices is that it encourages software that is structured like the teams that wrote it.

The general rule is that this is true anyway. However having a strategy that actually encourages it removes many communication headaches that accompany monolithic or highly distributed builds.


Is that an argument for a "one executable per team" architecture? I'd support that, but it seems closer to what most people mean by "monolithic" than what most people mean by "microservices" in my experience.


I find that the reason there is so much discussion about microservices and scaling object oriented applications is due to the limitations of object orientation in the first place. These same limitations are the reasons why distribution and network/local transparency, something Martin Fowler states he doesn't believe works, do work in functional programming languages but don't work in OO languages.

It all boils down to OO programmers wanting their applications to be scalable and maintainable. They have decided the way to do that is through modularity. But we suck at enforcing modularity in a single code base. This has been proven time and time again. Microservices are just a sneaky way of forcing that modularity on ourselves. Instead of designing your system as a single ball of mud (monolith), you'll design your system as a puddle of mud (microservices).

It is just entirely too hard to write good, modular OO programs. This is why we hang onto every book, blog post and word the Object-Oriented Gods send down to us. OO could be a great and amazing thing for certain domains of programming. By all means, create monoliths and use Martin Fowler's Cookie Cutter Scalability solution because it is simple. But if you find yourself needing microservices, you're better off picking up a functional language where modularity comes naturally.


I disagree wholeheartedly.

If you read that post again you'll see that Fowler's admonition about "not distributing objects" is really about "not distributing method calls", and method calls aren't that different from function calls so far as his argument is concerned in that (1) the latency of distributed function calls is 10^3-10^6 greater than local function calls, and (2) distributed function calls often fail for reasons inherent to the network and not the application itself.

I'll agree that functional programming languages have many virtues, in that you can compose pure functions, run them in parallel and so forth, but functional programming does not make the issues Fowler talks about go away.


> If you read that post again you'll see that Fowler's admonition about "not distributing objects" is really about "not distributing method calls", and method calls aren't that different from function calls so far as his argument is concerned in that (1) the latency of distributed function calls is 10^3-10^6 greater than local function calls, and (2) distributed function calls often fail for reasons inherent to the network and not the application itself.

The expense of distributed method calls comes from the fact that you are dealing with objects -- that is, containers of mutable state -- where you either need to synchronize copies across the network (expensive in terms of traffic/latency) or run all method calls on the place where the object lives (likewise expensive in terms of latency).

Pure function calls don't depend on mutable state, and so don't have those issues (further, to the extent that you do set up a system that requires a remote call to execute them, once you've executed a given pure function with a given set of arguments, you don't need to do it again, as the result is cacheable and guaranteed good forever.) You still have to distribute the calls that actually depend on mutable state, but if your language starts out with a clean distinction between pure functions and operations whose result might depend on mutable state, you start out miles ahead when it comes to dealing with distribution compared to working with a language which provides no language level distinction between pure and impure code, requiring either manual segregation or treat-everything-as-state-dependent-and-pay-the-cost approaches.


Practically speaking, function calls over a network are by definition impure.

Any function that is capable of returning varying results for the same arguments is impure. And a function call across the network could return anything - a correct response, a corrupt response, no response - depending on what the network's having for lunch at that moment. We do have clever tools like the I/O monad, yes. But the I/O monad doesn't take I/O and make it pure; it takes I/O and wraps it up in a clever facade that (usually) lets you pretend that it's pure.


Whether a function is logically pure or not precedes the decision to implement via a network call and, in the event that a pure function is for some reason implemented via a network call, shapes how you implement it as a network call (as it affects cacheability, etc.)

Obviously, the medium of a network call is inherently impure -- but in the same way that anything implemented in any physical device is (including internal computations in an electronic computer without network communication.)


Technically, yes, you can follow that reductio all the way down the rabbit hole. Practically speaking, though, it's not generally considered problematic that there's no way to set a timeout before issuing the CALL instruction.


There are ways to set a timeout before issuing (a particular) CALL instruction. Occasionally these are important.


> The expense of distributed method calls comes from the [need to] run all method calls on the place where the object lives

Which is exactly the same in functional as OO languages. If service A has some data, and service B needs to make use of that data, then service B needs to make a call to service A.

The OO/functional axis is orthogonal to the monolithic/distributed axis.


What's expensive is shared mutable state. Isolated mutable state is fine - it can live in the place where it's used. Shared immutable state is also fine, as it can be distributed once and then reused as much as you like, and there's no need to worry about keeping all the copies in sync.

Functional programming is all about avoiding shared mutable state, so it does have an advantage here.


They might even make it worse. Imagine filtering a list using a predicate that hits a remote service on every call. What is correct behavior for higher-order functions which accept function arguments that may time out?

The official answer is probably that you'd use some sort of monad to sweep that mess under the carpet. At which point you've just proposed a very close functional analogue to some of the worst shenanigans that used to happen with distributed objects back in the '90s.

A better answer would be to say that you use something like Erlang's actor model. At which point you've proposed something very similar to - perhaps indistinguishable from - microservices.


> Imagine filtering a list using a predicate that hits a remote service on every call.

If it's not a pure function of the type of the list's elements, it's not really a predicate. (Though, in a language which doesn't distinguish pure from impure code, it might look like one.) So the question is somewhat incoherent from the start, though the problem is a real issue.

> What is correct behavior for higher-order functions which accept function arguments that may time out?

No different from any other operation that can timeout (or otherwise fail).

> The official answer is probably that you'd use some sort of monad to sweep that mess under the carpet.

Monads don't sweep mess under the carpet, they expose it. And they aren't really an answer to "what is the correct behavior..."; they are an answer to "once you've determined the correct behavior, what language features do you use to implement it in a pure functional language."

But, yes, in Haskell or a similar language, the question of "how do you deal with this external interaction that can fail" does likely involve the use of monads.

> At which point you've just proposed a very close functional analogue to some of the worst shenanigans that used to happen with distributed objects back in the '90s.

That's quite a leap, since invoking monads doesn't specify anything about functionality.

> A better answer would be to say that you use something like Erlang's actor model.

That's actually an orthogonal (non-)answer, since (1) the Actor model isn't an alternative which excludes monads, they operate in different conceptual domains, and (2) it still doesn't answer the "What is the correct behavior...?" question.

> At which point you've proposed something very similar to - perhaps indistinguishable from - microservices.

Yes, microservices are closely related to the actor model. Yes, monads are part of how you'd implement either in Haskell or a similar language. One isn't "better" than the other.


> The official answer is probably that you'd use some sort of monad to sweep that mess under the carpet. At which point you've just proposed a very close functional analogue to some of the worst shenanigans that used to happen with distributed objects back in the '90s.

I don't think it is the same, because the distinction is visible. Monadic code looks almost but not quite the same as pure code; it's not a lot of syntactic overhead, but it's enough to make it clear that something's going on. You can see right there in the type that this is invoking a remote call, and if you try to pretend that a distributed function is a regular function then the compiler will catch you and stop you. I don't think this is at all the same as CORBA, where there was literally no visible difference.
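That "visible at the call site" property can be approximated even outside Haskell; here is a loose Python sketch where a hypothetical `Remote` wrapper marks values that cross the network (all names here are invented for illustration):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Remote:
    """Hypothetical marker type: a value that comes over the network.
    Callers must unwrap it explicitly, so remote-ness is visible at the
    call site instead of hidden behind an ordinary-looking call."""
    run: Callable[[], object]


def local_price(units: int) -> int:
    return units * 3  # plain value: obviously pure and local


def fetch_discount(user_id: int) -> Remote:
    # a real implementation would perform the network call inside run()
    return Remote(run=lambda: 10)


total = local_price(5)               # no ceremony for pure, local code
discount = fetch_discount(42).run()  # the .run() marks the remote boundary
print(total - discount)  # 5
```

This is far weaker than a type system that enforces the distinction, but it captures the contrast with CORBA-style transparency, where a remote call looked identical to a local one.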


I think there's something quite exciting there in the functional habit of separating out pure from impure. Done well[0], it ought to actually make the difference between local and remote much more obvious.

[0] Facebook's Haxl library comes to mind


I'm interested to hear you back up that point about modularity coming naturally in functional languages. Not because I don't believe you -- I've worked with both Haskell and Scala and found both quite refreshing. But I've also encountered codebases on functional languages that are just as muddy and rigid as some of their OO counterparts... So I wonder how much of that modularity is actually a result of the language.


I should have said "more modular" but I definitely don't mean that modularity comes for free in FP languages. Programmers are capable of writing rigid programs in any language but I do feel in my little experience of using FP languages it is harder to do so, or more obvious when you are doing so. I'll give it a try anyway.

I think the modularity comes from most FP languages having fewer building blocks to work with than most OO languages. It's the same reason why users of OO languages with a ton of different building blocks (Java, C#, etc.) find more "minimalist" OO languages like Ruby refreshing. FP languages tend to take this simplicity even further. You essentially have just functions and modules (a place to group related functions). FP languages also usually don't have mutable state, unless you choose to emulate it in your program somehow.

To me it is about ditching the OO way of creating some representation of the circle of life or Kingdom of Classes hierarchy in your applications for just treating your program as data that goes through a sequence of transformations. Linear programs are always easier for me to understand than hierarchies.
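As a toy illustration of that style (a hypothetical example of my own, not the parent's): the program is just plain data flowing through a pipeline of transformations, with no class hierarchy in sight.

```typescript
// Hypothetical order-processing sketch: data in, transformations, data out.
interface OrderLine { item: string; qty: number; price: number }

const orderTotal = (lines: OrderLine[]): number =>
  lines
    .filter(l => l.qty > 0)                 // drop empty lines
    .map(l => l.qty * l.price)              // price each remaining line
    .reduce((sum, line) => sum + line, 0);  // sum the line totals
```

Reading top to bottom tells you exactly what happens to the data, in order, which is the "linear program" quality being described.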

Rich Hickey's Simple Made Easy[0] talk is a great overview of the subject. Now his talk isn't about modularity per se, but I think modularity is one of the many things that fall out of simplicity.

0 - http://www.infoq.com/presentations/Simple-Made-Easy


I also think the real complaint people have against mainstream OO languages, particularly Java, isn't about "inflexibility" but rather about total ecosystem complexity.

For instance, in PHP there are JSON serialization and de-serialization tools built into the language and people just use those.

In Java, on the other hand, you have to pick a third-party library, find it in Maven Central, and cut and paste it into the POM file, which is a gawdawful mess because it is all cut-and-pasted; every edit involves a tab war, so it's hard to view the diffs, etc.

Then you find out that the other guys working on the system already imported five different JSON libraries, but worse than that, some of the sub-projects depend on different versions of the same JSON libraries which occasionally causes strange failures to happen at run-time, etc...

Ironically these problems are caused by the success of the Java ecosystem. When you've got access to hundreds of thousands of well-packaged libraries that are (generally) worth reusing, you can get into a lot more trouble than you can in the dialect of FORTH you invented yourself.


This is a great point. Just look at the logging situation in Java.


I've built FP systems in a few languages, and I don't see them as a step change in ease of writing "good, modular" programs.

If you could tell me which aspect of FP you think does provide this big change, or link me to a discussion, I'd be very grateful!


Ok, is there an example (related to this area) where the contrast between the two is clear?


This article is missing any mention of promise pipelining, which solves some of the problems being discussed.

http://kentonv.github.io/capnproto/rpc.html

With promise pipelining, if you need to make two RPCs to the same server, and the result of the first is going to be an input to the second, you can actually do it in one network round trip. The trick is to send the server a message saying "Hey, when you finish that first call, substitute the result into the parameters of this second call".
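The mechanics can be illustrated with a toy in-process simulation (a hypothetical sketch of my own; this is not the Cap'n Proto API). The client sends both calls in one batch, with the second call's argument given as a placeholder referencing the first call's result, which the server substitutes before executing.

```typescript
// Hypothetical sketch of promise pipelining: a batch of calls where later
// calls may reference the results of earlier ones by index.
type Call = { method: string; arg: number | { resultOf: number } };

// Toy "server": executes a batch, resolving placeholders against the
// results accumulated so far. Both methods are stand-ins for real work.
function serveBatch(calls: Call[]): number[] {
  const methods: Record<string, (x: number) => number> = {
    getProfileId: x => x + 100, // pretend: look up a profile id
    loadProfile: x => x * 2,    // pretend: fetch the profile by that id
  };
  const results: number[] = [];
  for (const c of calls) {
    const arg = typeof c.arg === "number" ? c.arg : results[c.arg.resultOf];
    results.push(methods[c.method](arg));
  }
  return results;
}

// One round trip: the second call pipelines the first call's result.
const batch: Call[] = [
  { method: "getProfileId", arg: 7 },
  { method: "loadProfile", arg: { resultOf: 0 } },
];
// serveBatch(batch) → [107, 214]
```

The point is that the dependency between the two calls is resolved on the server, so the client never has to wait for the first result before sending the second call.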

With this, fine-grained calls no longer imply an enormous latency expense compared to coarse-grained ones. Meanwhile, fine-grained APIs are cleaner and more composable, as my link above describes.

It's unfortunate that CORBA gave distributed objects a bad name. Just like object-oriented design within a program is more expressive than procedural design, object-oriented network protocols are more expressive than the flat protocols we tend to see today. I've been working with object-oriented protocols a lot lately while using Cap'n Proto to build sandstorm.io, and I've surprised even myself at how much more elegantly I can express complex interactions.

CORBA only messed up in trying to make remote objects look the same as local objects. Everyone now agrees that was a terrible mistake. But making distributed objects work does not in any way require making them look exactly like local objects. Calls to a Cap'n Proto object look quite different from local calls, because you need to be aware of the network issues implied by the call. But I've found that the same higher-level OO design principles you might use locally translate remarkably well to Cap'n Proto interfaces.


It may be terrible in a lot of other ways, but the MediaWiki API does chaining of one API function into another. It's one of the features I noticed that really should be more prevalent in other APIs. An API call should be kind of like a mini-job submitted to the server that does some work and gets back some results.


I virtually never see mention of ZeroC ICE and OpenSplice/DDS. These seem like very well thought out and solid solutions for making distributed systems. Everybody seems to go on about Thrift and REST and ZeroMQ and so forth, when I would much rather use DDS or ICE. The other approaches seem to amount to homebrewing your own approximation of these technologies much of the time.


ICE does not support promise pipelining, though.

http://kentonv.github.io/capnproto/news/2013-12-13-promise-p...

I haven't looked at DDS.


This is the spirit of microservices to me:

https://news.ycombinator.com/item?id=7874317

"Instead of pretending everything is a local function even over the network ..., what if we did it the other way around?

Pretend your components are communicating over a network even when they aren't?"

-- Solomon Hykes (of Docker fame) on LibChan

To me, it's pretty much anti-OO, and that's why I find it refreshing.


I don't see what's anti-OO about that. You have message passing (over the network) and encapsulated state. The actor model, in the large, resembles an idealized object system a lot more than an idealized functional system.

Network access (and file I/O, etc.) is inherently impure. Pure functional style requires you to work around that, and async/message-passing approaches are an effective workaround. But the fact that functional languages funnel you into async doesn't make classical Smalltalk-style OO any less well-suited for async.


Isn't that kind of like erlang?


it's very much like that. and erlang has proved that it's a valid approach.


Yep. All the distributed systems I've worked with devolve into a bad reimplementation of part of Erlang.


> Given this uncertainty, the most important thing a writer like myself can do is to communicate as clearly as I can the lessons we think we've learned, even if they are contradictory. Readers will make their own decisions, it is our job as writers to make sure those decisions are well-informed ones, whichever side of the architectural line they fall.

This is a really insightful description of the role of someone documenting software architecture.


I really admire the humility of his wait-and-see attitude, especially since he is trying to keep an open mind against his instincts.

This is a bit OT, but it seems like Angular apps have similar problems to distributed objects, where you can wind up making lots of network calls to retrieve one of these, all of those, etc. I'm curious what advice people have about that.


> This is a bit OT, but it seems like Angular apps have similar problems to distributed objects, where you can wind up making lots of network calls to retrieve one of these, all of those, etc. I'm curious what advice people have about that.

You need to make a distinction between remote and local calls. You need something that looks a bit like function calls - not a several-line ceremony for each call - but also that clearly isn't the same thing as a function call, and you need a way to make sure you don't get the two mixed up. You need (whisper it) monads.

I've heard that new versions of Angular will have Dart as a primary platform? Maybe they can make a type-level distinction between remote and local calls in that language?


> I worry that this pushes complexity into the interconnections between services,

I guess the instinctive answer is, well, to let someone else solve the problem. Grab something existing/standard (a REST API + REST client, RabbitMQ + msgpack) or something similar.

What it still doesn't save you from is managing basic distributed systems issues - network partitioning, timeout, asynchronous starts and stops. Maybe it is better because by building this distribution into the core of the system and not trying to abstract it away behind an API (like the author says) it forces you to deal with them explicitly.
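For instance, a minimal sketch (TypeScript, my own example, not from the comment) of dealing with one of those issues, timeouts, explicitly at every remote call site rather than hoping an abstraction handles it:

```typescript
// Wrap any remote call with an explicit deadline: whichever settles first
// (the call or the timeout) wins, and the timer is always cleaned up.
function withTimeout<T>(call: Promise<T>, ms: number): Promise<T> {
  let timer: any;
  const timeout = new Promise<T>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([call, timeout]).finally(() => clearTimeout(timer));
}
```

Having to write `withTimeout(fetchThing(), 500)` everywhere is exactly the kind of "deal with it explicitly" friction the author argues for: the failure mode is in your face instead of abstracted away.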

Overall I still haven't decided if microservices is just one of those buzzwords invented because the old ones (Object Oriented, SOA, etc.) have gotten old and don't bring in consulting revenue anymore.


I think microservices are a new evolution of an old idea, one that is coming to fruition in an environment that could allow it to blossom.

As I posted elsewhere here, microservices are also explicit about being remote calls, and are built with the idea of assuming all your calls are always remote.

For instance, Docker is already built using a microservice architecture, and they are standardising that as a project called LibChan : https://github.com/docker/libchan

It's basically msgpack over SPDY+TLS, all the way down the stack.

disclaimer: I am one of the co-founders of jsChan, the node.js/javascript port of libchan.

https://github.com/GraftJS/jschan


I was wondering about the name change from SOA to Microservices myself.

I think both monolith and microservice architectures are valid, it just requires software architects to think a bit more about the size of the problem. If you just have a slow growing self-contained app, then monolith is probably the way to go.

If you have an app that has the potential to grow quickly, or is already integrating with multiple external services, it may make sense to look at microservices.

We switched from monolith to SOA a couple of years ago, and it's been a net win in terms of price, reliability of site performance, and ease of development. There were the obvious drawbacks you stated above of network timeouts, lag, and async issues, but it's just a different set of problems.


I believe they are grasping in the right direction, but what they really need to do is let go of the attachment to object oriented development. I don't mean to start the flame war, but if you insist that the architecture reflects the intent of the application, then why choose OO? If the architecture requires messages and buses and adapters, etc. what does OO bring to the table? Why do I think I need distributed objects? Am I choosing micro-services just because I want to try to get distributed objects working again?

It's not a language issue either. At this level we are talking about frameworks, models, domains, contracts, protocols, etc. This layer is not language dependent, although some languages are better designed to build frameworks that support these intents.

A classic example of how these assumptions creep into your designs is seen in the first chapter of Head First Design Patterns, where they discuss at length how to create the perfect object model for a duck computer game. When I read that, the first thing that came to my mind was, "Wait a minute, you are designing a computer game! Everything on the screen is a sprite. Sprites are moved around the screen by their coordinates once per game loop. How does the perfect duck object model help me here?"

Sounds very much like a hammer looking for a nail to me.

[edit] fixed a few typos


High-level knowledge representation like description logics could make these types of discussions obsolete if applied to information systems holistically. Describing relationships between data, and describing the processes themselves, in a common machine-processable language would allow the plumbing to be generated automatically; it could even enable systems to be converted automatically from coarse-grained message types to finer-grained ones and vice versa.


Oh for the love of <insert deity here> no! This will now be printed and stuck onto the wall by the astronaut architects where we work, who blindly parrot MF's words like gospel, without any understanding.

Software used to be shit, but at least fun, then we had software architects, and now software is just shit.

Sigh.



