Tuesday, April 28, 2015

Modern software architectures - PHP UK Conference 2015

The web has changed. Users demand responsive, real-time interactive applications, and companies need to store and analyze huge amounts of data. Some years ago, monolithic code bases with a basic LAMP stack, some caching and perhaps a search engine were enough. These days everybody is talking about micro-services architectures, SOA, Erlang, Golang, message passing, queue systems and much more. PHP no longer seems to be cool, but is this true? Should we all forget everything we know and just learn these new technologies? Do we really need all these things?
Published in: Engineering


Transcript

  • 1. MODERN SOFTWARE ARCHITECTURES Ricard Clau PHP UK Conference 2015
  • 2. HELLO WORLD • Ricard Clau, born and raised in Barcelona • Server engineer at Another Place Productions • Not doing much PHP these days • Open-source contributor and occasional speaker • Twitter (@ricardclau) / Gmail ricard.clau@gmail.com
  • 3. WE WILL TALK ABOUT • Technical teams and Software architecture evolution • SOA, MicroServices, Distributed systems • Real time stream processing, data storage • Different languages for different problems • Where does PHP fit in this new world?
  • 4. WHY THIS TALK? • The way we build applications has changed • Lots of people over-engineer for no reason • An industry full of stupid trends and hypes • A bit of a rant talk
  • 5. APPLICATIONS EVOLUTION
  • 6. LOTS OF PROJECTS START… • Classic LAMP Stack • PHP monolithic codebase • MySQL (or any other RDBMS) • Millions of £ generated • Nothing fundamentally wrong!
  • 7. TRAFFIC STARTS TO GROW… • Nginx / Varnish • Cache servers • NoSQL databases • Text search engines • Queue systems
  • 8. BUT SOMETIMES… • Over-engineered systems for no reason • Hard to maintain and develop with • NIH Syndrome, reinventing the wheel • New problems from complex design • Ultimately harming the company!
  • 9. LIVING IN THE CLOUD • Pay only for what you use • Save money in operations • Embrace failure: every piece can fail • Provisioned vs on-demand models • Many people waste a lot of money
  • 10. TEAMS EVOLUTION There is always some resistance!
  • 11. DEVOPS PHILOSOPHY • Infrastructure as code • Automation • Communication • Responsibility • No more Dedicated Ops?
  • 12. FULL-STACK DEVELOPERS • Very useful in startups • Impossible to be good at everything • It is good to have transversal skills • Look beyond PHP!
  • 13. ARCHITECTS / PLATFORM TEAMS • Not a big fan myself • They need to be building features • “Wisdom committee” idea • Founding engineers may NOT be best suited for these roles
  • 14. DISTRIBUTED SYSTEMS Message passing, embracing failure
  • 15. DEFINITION AND MOTIVATIONS • A collection of independent computers that appears to its users as a single coherent system thanks to a middleware • Concurrency, horizontal scalability, resilience, fault tolerance… • Some things don’t fit in just one box, cannot be computed with the biggest instance available or users are spread all over the world and need low latencies
  • 16. DISTRIBUTED SYSTEMS ARE HARD Don’t blindly follow the trends!
  • 17. CHOOSE YOUR STORAGE: CHAOS!
  • 18. CAP THEOREM • A shared-data system cannot guarantee all three of these simultaneously: • Consistency: All clients have the same view of the data • Availability: Each client can always read and write • Partition tolerance: The system works well even when there are network partitions
  • 19. “During a network partition, a distributed system must choose between either Consistency or Availability”
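  • 20. Consistency / Availability / Partition Tolerance • Consistency + Availability: Mostly RDBMS (MySQL, PostgreSQL, DB2, SQLite…) • Consistency + Partition Tolerance: Special nodes (Zookeeper, HBase, MongoDB, Redis…) • Availability + Partition Tolerance: All nodes same role (Cassandra, Riak, DynamoDB…)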
  • 21. SOA & MICROSERVICES Doing one thing, and doing it right… right?
  • 22. SOA PRINCIPLES • Self-contained units of functionality • Share contract and schema • Can evolve independently, be reused • We need some orchestration • Stop thinking applications, think business processes
  • 23. WHY DOES SOA USUALLY FAIL? • It’s hard to explain the business value • Strong impact on the organisation • Sometimes, we do SOA “on the cheap” • People not skilled / experienced enough • New complexities are added
  • 24. MICROSERVICES • Small units of software following SRP • Replaceable, upgradeable, independent • Encapsulated, composable, client friendly • Fast startup / shutdown, testable • SOA integrates different apps, Microservices architect a single app
  • 25. HOW MICRO SHOULD THEY BE? • No rule of thumb • Some people overdo it with nanoservices • A microservice is NOT a function • Many services add complexity and inter-call latencies
  • 26. ORCHESTRATION • Responsible for interoperation • Also makes a client perceive a single system • Can become extremely complex • Proper frameworks need to emerge • Some people move their complexity here!
  • 27. SERVICE DISCOVERY • Directory of services, registering and finding them • Sounds pretty much like DNS, right? • Hard to make DNS highly available • DNS was mostly designed for standard ports • DNS was not optimised for real-time changes
  • 28. TESTING MICROSERVICES • Most of the time, unit testing adds little value • You need to do integration tests and load tests • Testing network connectivity problems can get tricky • Most of the time you need end-to-end tests! Hard to maintain!
  • 29. COMMUNICATION BETWEEN SERVICES Async Workers Offline Workers Queue Message Bus Protobuf Thrift JSON
  • 30. Proxies ELB Server Database Booking Cinema Validation Audit Stats Seating Fraud Payment Users Mails Analytics
  • 31. TIMEOUTS AND RETRIES: MADNESS! • Should we always retry on failure? Idempotence problem • Servers down? Database locks? Database down? Network issue? • We can have 2 timeouts: establishing connection but also waiting for a response. And we have a chain of those!
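The retry question on this slide can be illustrated with a minimal sketch. It is not from the talk: the helper name `call_with_retries`, its parameters and the simulated `flaky_lookup` operation are all hypothetical, and the only failure modelled is a timeout. The idea is that a call is retried (with exponential backoff) only when the caller declares it idempotent; anything else gets a single attempt, because repeating it could apply the same side effect twice.

```python
import time


def call_with_retries(operation, *, idempotent, max_attempts=3,
                      base_delay=0.1, sleep=time.sleep):
    """Run operation(); retry on TimeoutError only if the call is idempotent."""
    # Hypothetical helper: non-idempotent calls get exactly one attempt.
    attempts = max_attempts if idempotent else 1
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


# Simulated remote call that times out twice and then succeeds.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated read timeout")
    return "ok"


# The sleep function is replaced so the example runs instantly.
result = call_with_retries(flaky_lookup, idempotent=True, sleep=lambda s: None)
print(result, calls["count"])
```

In a real client the connection timeout and the response timeout would be configured separately, and every service in a chain of calls would need its own budget for both.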
  • 32. OTHER PROBLEMS • One microservice can take your whole system down • Health-checks, constant monitoring and instrumentation • Availability goes down in a chain of calls (0.99^3 ≈ 0.97) • It’s up to the team to decide in every situation / problem if breaking into microservices is worth the hassle
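The availability figure on this slide can be checked with a short calculation. This is a sketch under one stated assumption: the services fail independently, so a request that needs every service in the chain succeeds with the product of their availabilities. The function name `chain_availability` is made up for the example.

```python
from math import prod


def chain_availability(availabilities):
    """Availability of a request that needs every service in the chain to respond."""
    # Assumes independent failures, so the individual probabilities multiply.
    return prod(availabilities)


three_hops = chain_availability([0.99, 0.99, 0.99])
ten_hops = chain_availability([0.99] * 10)

print(f"3 services at 99%: {three_hops:.4f}")   # 0.9703
print(f"10 services at 99%: {ten_hops:.4f}")    # 0.9044
```

Under the same assumption, ten chained services at 99% each leave roughly 90% availability for the request as a whole.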
  • 33. DATA ANALYSIS All businesses need it!
  • 34. QUERY VS PROCESSING • SQL is great because we can query by any field • There is no standard in NoSQL databases • NoSQL systems are more limited, only keys (some allow secondary indexes) or complex graph syntax • We sometimes need processing for complex queries
  • 35. MAP-REDUCE
  • 36. HADOOP VS SPARK • Techniques to extract subsets of the data (MAP) and operate on them in parallel before aggregating (REDUCE) • Not real time; Hadoop is the most popular • Apache Spark opens a new paradigm for near real-time • You need other languages for these techniques
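The MAP and REDUCE steps described on this slide can be sketched in a few lines. This is a single-process illustration in plain Python, not Hadoop or Spark code: the sample sentences and the function names `map_chunk` and `merge_counts` are invented for the example. In a real cluster each chunk would be processed on a different machine, which is possible because the map step needs nothing but its own chunk.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """MAP: count the words of one chunk, independently of every other chunk."""
    return Counter(word.lower() for line in lines for word in line.split())


def merge_counts(left, right):
    """REDUCE: combine two partial results into one."""
    return left + right


# Each inner list stands for a chunk of data stored on a different node.
chunks = [
    ["php is not going anywhere", "look beyond php"],
    ["choose the right tool", "php is one tool"],
]

partials = map(map_chunk, chunks)                    # could run in parallel
totals = reduce(merge_counts, partials, Counter())   # aggregate the partial counts

print(totals["php"], totals["tool"])  # 3 2
```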
  • 37. REAL TIME? • Pseudo real-time (up to 60 seconds) is usually enough • Unless you are building a video game with social features :) • What if we have a distributed application between regions? • Do we prefer latency to one region or the replication hassles?
  • 38. FIREHOSE • Twitter: real-time stream of tweets • Technologies like Kafka, Amazon Kinesis, RabbitMQ or NSQ allow us to create a firehose of events from our system to process • Be VERY careful with the different trade-offs
  • 39. MODERN STREAM PROCESSING Service C Service B Service A Service D Credits to @alexanderdean from Snowplow Email MKT CRM Analytics 3rd party Unified log Event stream Keeping few days Low latency Streaming APIs / web hooks Own data center High latency Archive “Big” Data Hadoop Spark Workers “Live” analysis Monitoring Low latency Pseudo real-time API
  • 40. WHAT ABOUT US? Is there any hope?
  • 41. PHP • Libraries for everything • Community and documentation • Created for the web • Easy to scale horizontally • Facebook, Etsy, Youporn, Yahoo… • Slow • Language quirks and WTFs • Lack of threading • Not a great reputation • Catching up slowly
  • 42. DON’T USE PHP! • Heavy math calculations, massive data processing • Long-running CLI processes, queue workers • Intensive threading, forking, message passing • High concurrency scenarios with non-cacheable requests • Not ideal either for writing DSLs or using websockets
  • 43. ERLANG • Extreme lightweight / concurrency • Maturity / Stability / Tooling • OTP framework • Hot code swaps • RabbitMQ, Riak, WhatsApp, Ejabberd… • Somewhat alien syntax • Steep learning curve • String processing • Hard to find developers
  • 44. GOLANG • Lightweight / easy concurrency • Feasible learning curve • Easy deploy, fast compile • Hype and momentum • Docker, NSQ, used by many… • Immature libraries • No exceptions, error bubbling • Hard to recover from panics in goroutines • Not great for business logic
  • 45. SCALA • Runs on the JVM • Mature, stable, libraries, fast • Tooling • Big hype in the Java community • Akka, SBT, Play Framework… • Weird object model • Perhaps too much magic • Incompatibilities between minor versions • Slow compilation
  • 46. LOOK BEYOND PHP It will surely make you a better developer
  • 47. THE FUTURE • The next decade will be fascinating for the industry • The internet of things will bring new challenges • Architectures… and life… are full of tradeoffs • PHP is not enough to address these needs • But… PHP is not going anywhere
  • 48. DON’T BLINDLY FOLLOW THE TRENDS
  • 49. READ THE DOCS CAREFULLY
  • 50. CHOOSE THE RIGHT TOOL
  • 51. “A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: a complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a simple system” John Gall, systems theorist
  • 52. QUESTIONS? • Twitter: @ricardclau • E-mail: ricard.clau@gmail.com • Github: https://github.com/ricardclau • Please rate the talk at https://joind.in/13372