Persisting data in modern Web-Apps
LocalStorage, SessionStorage, IndexedDB, Service Worker, Filesystem API and of course Cookies 🍪!
It wasn't long ago that cookies were the only viable option for storing data persistently in the browser. For quite a long time, a cookie, a short and limited string of characters, was all the storage capability the browser offered. But technology has evolved, and so have the browsers. As a result, today's options for storing data in the browser, temporarily or persistently, are far better than they were some years ago and have become genuinely useful for application development.
Modern Web-Apps or PWAs aim to compete with native apps and websites; to accomplish that, a Web-App needs to be reliable, performant, and able to persist larger amounts of data such as text blobs or images. The newly available Filesystem API [1], adopted by Apple WebKit, will therefore be a game-changer.
This article discusses the different approaches to storing data, temporarily and persistently, for websites and web apps.
Cookies - The secure grandfather of storage
Cookies are key-value pairs, each stored as a string of at most 4KB. This limit applies per cookie and generally covers the cookie's name, value, and attributes (such as expiration date, domain, path, etc.).
Cookies are mainly used for three applications:
Token Storage for Authentication or Session-Identification: Cookies are frequently used to store session or authentication tokens. When a user logs into a website, the server sends a session token back to the user's browser, which is stored as a cookie. This cookie is then sent back to the server with each subsequent request, allowing the server to recognize the user and maintain their session. This is fundamental to maintaining stateful interactions in the otherwise stateless HTTP protocol.
Tracking Cookies for Monitoring or Marketing Purposes: Tracking cookies monitor user behavior across websites. They are commonly used by advertisers and analytics services to gather data on user activities, preferences, and browsing histories. This data helps in creating targeted advertising and understanding user behavior patterns. For example, a tracking cookie set by an advertisement on one site can track user actions on other sites that display ads from the same network.
Storing Simple App States in a Primitive Way: Cookies can store simple, non-sensitive data locally in the user's browser. This could include user preferences, such as layout choices, color themes, or other information that contributes to the user experience but is not critical. Since cookies are sent with every HTTP request to the server, this use should be limited to small amounts of data to avoid unnecessary network traffic.
🍪 Cookies are still viable because of their secure nature and because I use them as my emoji.
Of course, you cannot store a lot, but you can store it safely in a browser environment by removing access from the JavaScript scope. The HttpOnly [2] flag accomplishes this: scripts can no longer read the cookie, and the browser only passes it to the server in the request header. Combined with the Secure flag, the cookie is sent exclusively over a secured HTTPS connection.
Another security attribute is the SameSite directive, which helps prevent cross-site request forgery (CSRF [3]) attacks by restricting the sending of the cookie to requests that originate from its own site.
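To make the flags more tangible, here is a minimal sketch of how a server could assemble the Set-Cookie header value for a session token. The helper name buildSessionCookie, its option names, and the defaults are assumptions for illustration and do not come from any specific framework.

```typescript
// Hypothetical options for the illustration below.
interface CookieOptions {
  maxAgeSeconds?: number;
  path?: string;
  sameSite?: "Strict" | "Lax" | "None";
}

// Builds the value of a Set-Cookie response header for a session token.
function buildSessionCookie(
  name: string,
  token: string,
  options: CookieOptions = {},
): string {
  const { maxAgeSeconds = 3600, path = "/", sameSite = "Strict" } = options;
  const header = [
    `${name}=${encodeURIComponent(token)}`,
    `Max-Age=${maxAgeSeconds}`,
    `Path=${path}`,
    "HttpOnly", // not readable from JavaScript (document.cookie)
    "Secure", // only sent over HTTPS
    `SameSite=${sameSite}`, // limits sending on cross-site requests
  ].join("; ");

  // Conservative guard for the roughly 4KB per-cookie limit.
  // Assumes an ASCII cookie name; the value is percent-encoded above,
  // so the string length equals the byte length.
  if (header.length > 4096) {
    throw new Error("Cookie exceeds the ~4KB per-cookie limit");
  }
  return header;
}
```

A server would send the returned string as the value of a Set-Cookie response header; the browser then attaches the cookie to later requests without exposing it to scripts.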
Caching with Service Worker
Caching in the browser is an old technique to reduce requests to the server and, therefore, the transferred payload, which improves the overall user experience. Unfortunately, this basic caching reaches its limits as soon as the device connectivity is gone, and it gives the developer too little control over what to cache.
With the introduction of the service worker [4] came the capability to implement runtime caching strategies [5] [6] for specific routes in your app. It's possible to cache and persist the network traffic of your web app and reuse it with a Stale-While-Revalidate (SWR) approach.
This approach stores responses in the cache storage. On the next call of the Web App, the service worker, which acts as a proxy, serves the stale data and decides when to re-fetch according to the current caching strategy.
This basic technique provides offline capability and reliability, two fundamentals of progressive web apps.
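The Stale-While-Revalidate idea can be sketched independently of any library. The version below is a simplified illustration, not Workbox's implementation: the CacheLike interface and the function name are assumptions, modelled loosely on the match and put methods of the browser Cache API so that the logic can run outside a service worker.

```typescript
// Minimal cache shape, loosely modelled on the Cache API (match / put).
interface CacheLike<Req, Res> {
  match(request: Req): Promise<Res | undefined>;
  put(request: Req, response: Res): Promise<void>;
}

// Serve the cached (stale) response immediately if there is one,
// and refresh the cache from the network in the background.
async function staleWhileRevalidate<Req, Res>(
  request: Req,
  cache: CacheLike<Req, Res>,
  fetchFromNetwork: (request: Req) => Promise<Res>,
): Promise<Res> {
  const cached = await cache.match(request);

  // Always start the network request; its result updates the cache.
  const refreshed = fetchFromNetwork(request).then(async (fresh) => {
    await cache.put(request, fresh);
    return fresh;
  });

  if (cached !== undefined) {
    // Ignore background failures (e.g. offline); the stale copy is served.
    refreshed.catch(() => undefined);
    return cached;
  }
  // Nothing cached yet: the caller has to wait for the network.
  return refreshed;
}
```

In a real service worker, the response would be cloned before caching (a Response body can only be read once), and the background refresh would be kept alive with event.waitUntil.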
Available strategies in Google Workbox
Stale-While-Revalidate: This strategy serves content from the cache (stale) while simultaneously fetching an updated version from the network. The updated content is then cached for the next use. This approach is useful for assets where immediate speed is critical, but fresh content is also important. For example, a news website might use this for articles, ensuring quick loading while updating the latest content.
👉 Balanced Approach
Network-First: This strategy attempts to fetch the latest content from the network first. If the network request fails (perhaps due to a lack of internet connectivity), it falls back to a cached response. It's ideal for dynamic content that changes frequently and needs to be as up-to-date as possible, such as live sports scores or stock market data.
👉 Fresh-First approach
Cache-First: With this approach, the service worker checks the cache first and serves the response from there if it’s available. Only if the resource is not in the cache does it fetch from the network. This is great for static, unchanging assets like logos, CSS files, and JavaScript libraries. It ensures quick load times and minimizes network traffic.
👉 Static approach: when things don’t change
Network-Only: This strategy bypasses the cache entirely and always fetches the resource from the network. It’s the equivalent of not implementing a caching strategy in the service worker. It's suitable for dynamic or sensitive information where updates are frequent, and caching could present outdated or incorrect data.
👉 No-Caching approach
Cache-Only: This strategy serves responses exclusively from the cache without using the network. If the requested resource isn’t in the cache, it results in an error. This approach is used for scenarios where the application has to work offline, like a PWA, after it has precached its necessary resources during installation or activation.
👉 No online approach
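Two of the strategies above, Network-First and Cache-First, differ only in the order in which cache and network are consulted. The following sketch shows that difference; it is an illustration under assumptions (the CacheLike interface and both function names are made up here) and not the Workbox source.

```typescript
// Minimal cache shape, loosely modelled on the Cache API (match / put).
interface CacheLike<Req, Res> {
  match(request: Req): Promise<Res | undefined>;
  put(request: Req, response: Res): Promise<void>;
}

// Network-First: try the network, fall back to the cache on failure.
async function networkFirst<Req, Res>(
  request: Req,
  cache: CacheLike<Req, Res>,
  fetchFromNetwork: (request: Req) => Promise<Res>,
): Promise<Res> {
  try {
    const fresh = await fetchFromNetwork(request);
    await cache.put(request, fresh); // keep a copy for offline use
    return fresh;
  } catch (networkError) {
    const cached = await cache.match(request);
    if (cached === undefined) throw networkError; // nothing to fall back to
    return cached;
  }
}

// Cache-First: only go to the network when the cache has no entry.
async function cacheFirst<Req, Res>(
  request: Req,
  cache: CacheLike<Req, Res>,
  fetchFromNetwork: (request: Req) => Promise<Res>,
): Promise<Res> {
  const cached = await cache.match(request);
  if (cached !== undefined) return cached;
  const fresh = await fetchFromNetwork(request);
  await cache.put(request, fresh);
  return fresh;
}
```

Network-Only and Cache-Only are the degenerate cases: the first never touches the cache, the second never touches the network.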
The Service-Worker is not meant to store and retrieve specific data on demand; it is intended to be a passive persistor of network traffic and already received data. A good example is a map app that downloads tiles once and makes them available in the cache for reliable offline usage, or a news app that caches already downloaded articles for offline use later.
Caching is becoming more critical since mobile applications depend on their current network, which might be of low bandwidth, poor quality, or unavailable when requesting new data. Still, the app should show no error code or failed state; the user journey must continue smoothly. This behavior is standard for native apps and should be adopted by the web using the Cache API.
IndexedDB - Asynchronous go-to storage
To store data on demand in the browser, it's recommended to use IndexedDB, as long as there's no sensitive data. With IndexedDB, the browser provides key-value-based databases, which wrapper libraries expose through a basic interface such as setItem, getItem, removeItem.
However, it's important to mention that those actions are asynchronous. In detail, the native IndexedDB API, with its object stores, transactions, and request callbacks, is more complicated than that, but solid promise wrappers like localForage or idb are available.
Unlike its synchronous cousin, LocalStorage, IndexedDB provides multiple databases with individual versions. So, it's possible to structure your data in a bucket approach, which is recommended anyway.
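To give an impression of what such promise wrappers do internally, here is a minimal sketch that turns a callback-based request into a promise. The RequestLike interface and promisifyRequest are assumptions for illustration; they only mirror the onsuccess, onerror, result, and error members of IndexedDB request objects and are not the code of localForage or idb.

```typescript
// Minimal shape shared by IndexedDB request objects (IDBRequest):
// they report their outcome through onsuccess / onerror callbacks.
interface RequestLike<T> {
  result: T;
  error: unknown;
  onsuccess: ((event: any) => any) | null;
  onerror: ((event: any) => any) | null;
}

// Wraps a callback-based request in a promise.
function promisifyRequest<T>(request: RequestLike<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Browser usage (not executed here; "app-db" and "settings" are made-up
// names, and the "settings" object store must already have been created
// in an onupgradeneeded handler):
//   const db = await promisifyRequest(indexedDB.open("app-db", 1));
//   const store = db.transaction("settings", "readwrite").objectStore("settings");
//   await promisifyRequest(store.put("dark", "theme")); // value first, then key
//   const theme = await promisifyRequest(store.get("theme"));
```

The full libraries add considerably more on top, such as transaction handling and upgrade logic, which is why using one of them is usually the more pragmatic choice.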
Quota-Limitations of storage
IndexedDB is subject to the storage limits of the browser. These limits differ from browser to browser and even between operating systems. The exact quotas are a topic of their own and ever-changing, although they have improved in the last few years.
Even the installation state (or “Add to Homescreen” on iOS) affects what the app can store. I strongly recommend you keep yourself updated if the quota is mission-critical for your company or customer.
Interestingly, the installed version of a WebApp or PWA on Apple Safari receives separate storage and quotas, even if it is the same domain. The installation state also affects data persistence. On Android, the installed version shares the same storage as the browser version.
In 2022: Android Chrome lets you store around 60-80% of your free disk space (in private mode, it's significantly lower). At the same time, Apple Safari raised its limit from 500MB to 1GB of data, and users are prompted for permission to raise the limit further in 200MB steps.
In 2024: As of macOS 14 and iOS 17, Safari allocates up to 20% of the total disk space for each origin's storage. If the user saves a web app on their Home Screen or Dock, this limit increases to up to 60% of the disk size. Additionally, Safari enforces an overall limit where stored data across all origins cannot exceed 80% of the disk size for each browser and web app and 15% for non-browser apps displaying web content. These limits are designed to prevent fingerprinting, so the available quota might vary.
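Because the numbers differ so much, it can help to ask the browser at runtime instead of hard-coding a quota. The sketch below assumes the StorageManager API (navigator.storage.estimate()) is available; the interfaces and the function name remainingStorageRatio are made up for illustration, and the reported values are estimates rather than exact figures.

```typescript
// Minimal shapes mirroring navigator.storage.estimate() and its result.
interface StorageEstimateLike {
  usage?: number; // bytes currently used by the origin (estimate)
  quota?: number; // bytes available to the origin (estimate)
}
interface StorageManagerLike {
  estimate(): Promise<StorageEstimateLike>;
}

// Returns the share of the quota that is still free (0..1),
// or undefined when the browser does not report usable numbers.
async function remainingStorageRatio(
  storage: StorageManagerLike,
): Promise<number | undefined> {
  const { usage, quota } = await storage.estimate();
  if (usage === undefined || quota === undefined || quota === 0) {
    return undefined;
  }
  return (quota - usage) / quota;
}

// Browser usage (not executed here):
//   const free = await remainingStorageRatio(navigator.storage);
//   if (free !== undefined && free < 0.1) { /* warn the user, upload data */ }
```

An app holding mission-critical data could run such a check before large writes and react early instead of waiting for a failed transaction.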
LocalStorage - Looks better than cookies but should be avoided these days.
The LocalStorage feature looks great and is easy to use. But it has its downsides. It's synchronous, which blocks the main thread, and is therefore not recommended. This type of storage can work for a single string, but storing more significant amounts of data is a huge problem and a no-go. In addition to that, the quota is relatively tiny at 5MB.
Security is the second reason, and this point applies to IndexedDB and Session Storage as well. These storages are always accessible through JavaScript and cannot be protected with an HttpOnly flag.
🚨 With that in mind, you should never store sensitive data like a JWT in these stores.
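If LocalStorage is used at all, for example for a single non-sensitive preference string, the small quota means a write can fail. The following sketch guards against that; StorageLike and trySetItem are hypothetical names, with StorageLike mirroring only the setItem method of the browser's Storage interface.

```typescript
// Minimal shape mirroring the setItem method of localStorage / sessionStorage.
interface StorageLike {
  setItem(key: string, value: string): void;
}

// Attempts a write and reports whether it succeeded.
// setItem throws when the write fails, e.g. because the quota is exceeded.
function trySetItem(storage: StorageLike, key: string, value: string): boolean {
  try {
    storage.setItem(key, value);
    return true;
  } catch {
    return false;
  }
}

// Browser usage (not executed here):
//   if (!trySetItem(localStorage, "theme", "dark")) { /* fall back or ignore */ }
```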
Session Storage - Sibling of LocalStorage with dementia
This storage type works like LocalStorage, but its lifetime is bound to the open tab in the browser or to the running PWA. Each tab gets its own session per origin: the data survives page reloads, is not shared with other tabs, and is cleared when the tab or the PWA is closed. This storage is also limited to about 5MB of data; therefore, it's not meant to be used for files.
Filesystem API - Durable file-based storage, now available on Safari as well 🚀
I read a WebKit blog post, The File System Access API with Origin Private File System, in 2022. The post amazed me because Apple had not been supporting the PWA idea much, but after some years of waiting, it finally started to move forward again.
Since then, the WebKit team has been constantly moving towards PWA, which made me quite happy.
We adopted the Filesystem API early in the game. Unfortunately, it was removed again in one version, and our PWAs fell back to the IndexedDB fallback we created.
For me, the biggest advantage is real persistence. Sometimes, apps need to be re-installed. Removing a PWA, or a website in general, from your system removes the traditional persistent data, like IndexedDB, LocalStorage, etc.
🍪 It’s different with the Filesystem API. In our experience, the data persists until the user intentionally deletes it, and re-installing the app does not lead to data loss. That is important for production PWAs that have captured “golden data” that cannot be lost.
What Filesystem seems to do better than IndexedDB
Frequently storing large blobs of data of up to several megabytes was possible with IndexedDB, but it has downsides. IndexedDB seems to struggle with many transactions in a short amount of time when the payload to read or write is large. With Sentry.io, we monitored many quota issues on Apple devices in one specific app, which we could reproduce with tools like local storage abuser [7].
Instead of prompting the user to extend the current quota, the database crashed; in some cases, restarting the webview restored the persisted data; in others, the data was lost entirely.
How we solved the problem without a stable Filesystem API available
Those problems occurred predominantly on the older iPhone 11 generation with iOS 15. We mitigated the problem by reducing the database stress regarding the number of requests and payload.
Additionally, we replaced packages like Ionic Capacitor for photo capturing with our own implementations based on web standards to have more control over I/O operations.
Hardening the persistence layer with Filesystem API
To avoid that problem entirely, we started implementing the Filesystem API into our Progressive Web Apps. This was possible for us since iOS 15.4 or macOS 12.3, when the WebKit team implemented the API in a usable way for production-grade apps.
The Filesystem storage is persistent for real.
Removing the app and clearing its website data deletes the other storage options; in our experience, this does not happen with the Filesystem. As a result, persisted files are harder to delete by accident, and inexperienced users can keep their data safe, even with improper app handling. (So, the theory! :) )
🍪 Golden Data is a term for business apps with mission-critical data, like photos or JSON data, which could only be captured or created in a specific timeframe and a particular place with the app. Thus, the data cannot be restored and must stay safe on the device until the app can upload the data to the servers. We used to accomplish that with IndexedDB, and we had many problems there, especially with the two main caveats I already mentioned. With the Filesystem, the data still exists even if the applications need to be re-installed, as long as the user doesn't delete it on purpose.
Synchronous and asynchronous Filesystem API
The Filesystem can be accessed via file handles. This can be done asynchronously in the webview or synchronously within a web worker [8]. The latter approach is meant to be more performant but, by the nature of a web worker, more complicated to set up.
Within a worker thread, a synchronous access handle is created from the file handle, which was implemented as a layer of security. The access handle takes exclusive ownership of the file: until it is closed, no other worker or script can open that file for writing.
Unfortunately, at the time of writing, the web worker implementation is only available on Safari. (Yep, Apple implemented it, and Google hasn't yet. I appreciate that!)
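The worker-side flow can be sketched as follows. This is a simplified illustration under assumptions: the two interfaces only mirror the createSyncAccessHandle, truncate, write, flush, and close members of the File System API as currently specified, the function name writeBytesInWorker is made up, and some earlier browser versions exposed promise-returning variants of several of these methods.

```typescript
// Minimal shapes mirroring FileSystemSyncAccessHandle and the
// createSyncAccessHandle() method of a file handle (worker only).
interface SyncAccessHandleLike {
  truncate(newSize: number): void;
  write(buffer: Uint8Array, options?: { at?: number }): number;
  flush(): void;
  close(): void;
}
interface WorkerFileHandleLike {
  createSyncAccessHandle(): Promise<SyncAccessHandleLike>;
}

// Replaces the file content with the given bytes and returns
// the number of bytes written.
async function writeBytesInWorker(
  fileHandle: WorkerFileHandleLike,
  bytes: Uint8Array,
): Promise<number> {
  const access = await fileHandle.createSyncAccessHandle();
  try {
    access.truncate(0); // drop old content so no stale bytes remain
    const written = access.write(bytes, { at: 0 });
    access.flush(); // persist the changes to disk
    return written;
  } finally {
    access.close(); // always release the exclusive ownership of the file
  }
}

// Worker usage (not executed here; "photo.bin" is a made-up file name):
//   const root = await navigator.storage.getDirectory();
//   const handle = await root.getFileHandle("photo.bin", { create: true });
//   await writeBytesInWorker(handle, new Uint8Array([1, 2, 3]));
```

Closing the handle in a finally block matters here: as long as it stays open, the file remains reserved for this worker.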
The asynchronous webview approach is easier to access, especially when reading many files to display them as thumbnails. You don't need to ensure the web worker is running; you can access the stored data via promises.
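Reading a stored file via promises could look like the sketch below. The interfaces are assumptions that mirror only the getFileHandle and getFile methods of the directory and file handles, and readTextFile is a hypothetical helper; in the browser, the root directory would come from navigator.storage.getDirectory().

```typescript
// Minimal shapes mirroring the parts of the File System API used below.
interface FileLike {
  text(): Promise<string>;
}
interface FileHandleLike {
  getFile(): Promise<FileLike>;
}
interface DirectoryHandleLike {
  getFileHandle(name: string, options?: { create?: boolean }): Promise<FileHandleLike>;
}

// Reads a file from the given directory as text.
// Rejects if the file does not exist (no create option is passed).
async function readTextFile(
  directory: DirectoryHandleLike,
  name: string,
): Promise<string> {
  const handle = await directory.getFileHandle(name);
  const file = await handle.getFile();
  return file.text();
}

// Browser usage (not executed here; "report.json" is a made-up file name):
//   const root = await navigator.storage.getDirectory();
//   const json = await readTextFile(root, "report.json");
```

For thumbnails, the File object returned by getFile() can be used directly instead of reading it as text, since a File is also a Blob.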
Both ways are relevant, and the developer should understand the approaches before working with them in production. Web workers can also be complex when testing (Test-Driven Development or Test-Last Development). If you are interested in those things, you should prepare carefully here.
The primary browser developer tools currently lack tools to interact with the Filesystem. We first developed a basic filesystem explorer based on our varied experience debugging remote devices. With this tool, we confirmed that the Filesystem API works stably.
We will be early adopters.
The Filesystem API is usable but not fully implemented with every security aspect and detail. I greatly appreciate the current development; in my opinion, it is a huge milestone and game-changer, especially for PWAs. We will use this API as an early adopter in production on the major platforms.
Since Progressive Web Apps are progressive, we are developing a filesystem wrapper that falls back on IndexedDB if the new API isn't implemented. This approach feels solid and promising, and I will report back in the future.
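The fallback decision itself is small. The sketch below is not our actual wrapper; BlobStore, pickStore, and the factory parameters are hypothetical names, and the only platform detail it relies on is that the Filesystem entry point is navigator.storage.getDirectory().

```typescript
// Hypothetical common interface that both backends would implement.
interface BlobStore {
  write(name: string, data: string): Promise<void>;
  read(name: string): Promise<string | undefined>;
}

// Minimal shape of the environment used for feature detection
// (in the browser, this would be the navigator object).
interface EnvironmentLike {
  storage?: { getDirectory?: unknown };
}

// Picks the Filesystem-backed store when the API is present,
// otherwise falls back to the IndexedDB-backed store.
function pickStore(
  environment: EnvironmentLike,
  createFilesystemStore: () => BlobStore,
  createIndexedDbStore: () => BlobStore,
): BlobStore {
  const hasFilesystemApi =
    typeof environment.storage?.getDirectory === "function";
  return hasFilesystemApi ? createFilesystemStore() : createIndexedDbStore();
}

// Browser usage (not executed here; both factories are placeholders):
//   const store = pickStore(navigator, makeFilesystemStore, makeIndexedDbStore);
```

Keeping both backends behind one interface means the rest of the app does not need to know which storage is in use.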
Thanks for reading,
Adrian,
stackableCTO, Mentor, and CTO webbar.dev
[1] https://webkit.org/blog/12257/the-file-system-access-api-with-origin-private-file-system/
[2] https://owasp.org/www-community/HttpOnly
[3] https://de.wikipedia.org/wiki/Cross-Site-Request-Forgery
[4] https://web.dev/learn/pwa/service-workers
[5] https://web.dev/articles/runtime-caching-with-workbox
[6] https://developer.chrome.com/docs/workbox/caching-resources-during-runtime
[7] https://demo.agektmr.com/storage/
[8] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers