applications/photo-asset-manager/DESIGN.md
Photo Asset Management (PAM) went through a typical engineering design process. The team, with its educational focus, began by brainstorming possible applications that would show Amazon Rekognition and S3 lifecycles in a realistic scenario. It then selected a user persona and critical user journeys that would benefit from those services, and made engineering and architectural decisions driven by those journeys. This is a lightly edited summary of the design decisions made while developing PAM.
Dan is a casual photographer (shooting in JPEG) who focuses on nature photography. He also takes ad-hoc photos of his friends and family. He wants a website where he can upload all of his photos, store them indefinitely, and download bundles of images that match nature-related tags (“forest”, “lake”, “mountain”, etc.). Dan is the end user of this application.
Dan visits PAM and completes the “upload photos” flow. Dan sees a loading spinner while the photos are analyzed by Rekognition. When analysis completes, the UI shows a list of tags and a count of photos for each tag. Dan selects the tag “mountain (32)”, adds his phone number or email address, and clicks Download. Dan later receives a message (text or email) with a link to a zip file containing his images. (The link is only valid for a limited amount of time.)
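One way to produce that time-limited link is an S3 presigned URL. This is a minimal sketch, assuming the zip of requested images has already been written to S3; the function name and the 24-hour expiry are illustrative placeholders, not decisions recorded in this design.

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

// Generate a time-limited link to the zip of requested images. The link is
// embedded in the text or email notification sent to the user.
export async function presignZipDownload(bucket: string, key: string): Promise<string> {
  return getSignedUrl(s3, new GetObjectCommand({ Bucket: bucket, Key: key }), {
    expiresIn: 24 * 60 * 60, // placeholder: link valid for 24 hours
  });
}
```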
(The storage bucket can also be seeded in bulk by syncing from an existing bucket, e.g. `aws s3 sync s3://njogis-imagery/ s3://${STORAGE_BUCKET}`.)

This is an ASCII sketch of the wireframe:
```
(Upload images) (Import Bucket)

Tags
[ ] Mountain (32)
[ ] Lake (27)
[ ] Clouds (18)

[Phone Number|Email] (Download)
```
- Select tags → Click (Download) → Start User Story 3
- Upload Images → `<input type="file" multiple />` to select images and upload them via a form (see the sketch below)
- ~~Import Bucket → [Bucket Name] (Copy) → Import jpegs from that bucket~~
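A minimal browser-side sketch of the upload flow above. It assumes a hypothetical `/upload` endpoint that returns a presigned S3 PUT URL for each file; the real API shape is not part of this wireframe.

```typescript
// Assumes an <input id="file-input" type="file" accept="image/jpeg" multiple />
// element and a hypothetical `/upload?name=...` endpoint returning { url }.
const input = document.querySelector<HTMLInputElement>("#file-input")!;

input.addEventListener("change", async () => {
  for (const file of Array.from(input.files ?? [])) {
    // Ask the backend for a presigned PUT URL for this image.
    const res = await fetch(`/upload?name=${encodeURIComponent(file.name)}`, {
      method: "PUT",
    });
    const { url } = await res.json();

    // Send the jpeg bytes straight to S3 using the presigned URL.
    await fetch(url, {
      method: "PUT",
      headers: { "Content-Type": "image/jpeg" },
      body: file,
    });
  }
});
```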
This example is entirely serverless. While cross-service apps have historically assumed a locally running monolith, the asynchronous nature of restoring from Glacier and zipping a large number of files necessitates a change in architectural approach. Because notifications can’t reach back to the customer’s locally running ephemeral instance, this example must have a deployed instance “somewhere”.
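To illustrate why this is asynchronous, here is a sketch of requesting a restore for an archived object; the actual PAM flow and names are not specified here, and the bucket/key are placeholders. The call returns immediately, and S3 reports completion minutes to hours later via an event notification that only a deployed backend can receive.

```typescript
import { S3Client, RestoreObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});

// Request that an archived (Glacier storage class) object be restored so it
// can be read and zipped. The restore itself completes out-of-band.
export async function requestRestore(bucket: string, key: string): Promise<void> {
  await s3.send(
    new RestoreObjectCommand({
      Bucket: bucket, // placeholder names; not part of this design
      Key: key,
      RestoreRequest: {
        Days: 1, // keep the restored copy available for one day
        GlacierJobParameters: { Tier: "Bulk" },
      },
    })
  );
}
```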
Because the app isn’t intended as a real-time or latency-sensitive workload, it defies frugality to leave an entire EC2 instance or ECS stack running to serve its requests. While this could be mitigated with an Auto Scaling group set to zero instances, that would incur significant management overhead in deciding when to scale back out to one instance. A serverless deployment with faster cold starts and automatic scale-in is preferable. Lambda and Fargate are the two primary offerings. Lambda requires custom configuration and build steps via CloudFormation, SAM, or CDK; has single-function handlers; and fits a cloud-first mental model. Fargate requires Docker containers for all applications and is slightly heavier. Technically, these solutions are of similar complexity and cost; they just expose the complexity in different ways.
The team decided, essentially by coin toss, to use Lambda.
| | Pros | Cons | Wash |
|---|---|---|---|
| Lambda | Cloud-first mental model | N independent lambdas | Cost |
| | Exciting and new | Managing library layers | |
| | Having a "real" non-trivial example | Difficult to run & debug locally | |
| Fargate | Monolith HTTP middleware | Heavyweight containers | Cost |
| | "Lift & Shift" from EC2 | "Boring" | |
| | Traditional debugging tools | | |
Two-click deployment: one click for general resources and one for the language-specific portions. Stretch goal: one click. (One stack for common resources, plus one stack per language.) Languages are recommended to use one layer for all functions and to use the function configuration’s “Handler” setting to choose between them per function (see the sketch below).
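A sketch of that recommendation in CDK (TypeScript), assuming a Java-based language stack; the construct IDs, handler class names, and asset path are illustrative only, not the actual PAM stack.

```typescript
import { Duration } from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { Construct } from "constructs";

// All functions share one code asset and one library layer; each function
// differs only in the configured handler that selects its entry point.
export function addPamFunctions(scope: Construct, sharedLayer: lambda.LayerVersion) {
  const common = {
    runtime: lambda.Runtime.JAVA_11,
    code: lambda.Code.fromAsset("functions/build"), // hypothetical artifact path
    layers: [sharedLayer],
    memorySize: 1024,
    timeout: Duration.seconds(30),
  };

  new lambda.Function(scope, "DetectLabelsFn", {
    ...common,
    handler: "com.example.pam.DetectLabelsHandler::handleRequest",
  });

  new lambda.Function(scope, "DownloadFn", {
    ...common,
    handler: "com.example.pam.DownloadHandler::handleRequest",
  });
}
```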