Uptime May 2018

Uptime report for the past month:

api.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/345532

dashboard.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346471

rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346472


Main Events

May 15, 2018 - Object Store Incidents

We had another issue with our main object store provider. As usual the system used the backup service, but now we can do a fast switch and start using a new primary service without downtime.

Incident:

https://status.digitalocean.com/incidents/ql22cg0mzsj4

Our Twitter message:


Notes

  • We migrated all our queues from Beanstalk to RabbitMQ. In the future we would like to write a blog post about our experience and why we made the change.

Uptime April 2018

Uptime report for the past month:

api.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/345532

dashboard.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346471

rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346472


Main Events

Apr 24, 2018 - Object Store Incidents

For one hour we had an issue with our primary Object Store provider. The system continued to work using the backup service. However, we would prefer a fast switch between providers instead of a fall-back when the first one fails; this is being addressed in the upcoming release, v67.
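To illustrate the difference between a fall-back and a fast switch, here is a minimal sketch in Python. It is not our production code: the class names, the in-memory stores and the `promote_backup` method are all hypothetical and only show the idea.

```python
class StoreUnavailable(Exception):
    """Raised by a provider that cannot serve the request."""


class InMemoryStore:
    """Hypothetical stand-in for a real object store provider."""

    def __init__(self, name, objects=None):
        self.name = name
        self.objects = dict(objects or {})
        self.healthy = True

    def get(self, key):
        if not self.healthy:
            raise StoreUnavailable(self.name)
        return self.objects[key]


class StoreRouter:
    """Reads from the primary provider and falls back to the backup."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def get(self, key):
        # Fall-back: every read first pays for a failed attempt on the primary.
        try:
            return self.primary.get(key), self.primary.name
        except StoreUnavailable:
            return self.backup.get(key), self.backup.name

    def promote_backup(self):
        # Fast switch: swap roles so reads go straight to the healthy provider.
        self.primary, self.backup = self.backup, self.primary


store_a = InMemoryStore("provider-a", {"img.jpg": b"data"})
store_b = InMemoryStore("provider-b", {"img.jpg": b"data"})
router = StoreRouter(store_a, store_b)

store_a.healthy = False                  # simulate an incident on the primary
fallback_result = router.get("img.jpg")  # tries provider-a, then provider-b
router.promote_backup()
switched_result = router.get("img.jpg")  # goes to provider-b directly
```

With the fall-back alone, each request during an incident still waits for the primary to fail. After the switch, the healthy provider is tried first.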

Incident:

https://status.digitalocean.com/incidents/fhspl6w8yp1b

Our Twitter message:


Notes

  • On-going effort: we are in the process of migrating to a 100% container-based infrastructure in order to have more flexibility and improve our scalability response - more on this soon.

Uptime March 2018

Uptime report for the past month:

api.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/345532

dashboard.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346471

rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346472

Main Events

No main events.

Our API is now stable without any event in the past weeks.

Notes

  • We are in the process of migrating to a 100% container-based infrastructure in order to have more flexibility and improve our scalability response - more on this soon.

Uptime February 2018

Uptime report for the past month:

api.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/345532

dashboard.rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346471

rethumb.com
http://www.uptimedoctor.com/publicreport/vi083t3k/79788/1/346472

Main Events

Some details on the main events of the past month.

api.rethumb.com performance issues
  • Our main API servers received a Spectre and Meltdown update, but after the reboot performance was 10x worse. This caused noticeable issues with request handling and we had to move the servers to new hardware. We moved some of the servers to a new provider in order to avoid having all the servers in the same location.

  • Note: the slow performance is not directly related to the patch. Other machines also received the update and didn’t suffer any impact.

dashboard.rethumb.com down for 03hr 15min
  • Our dashboard suffered a major impact during the migration issues that we had in the past month. During the migration described above we also moved our dashboard to new hardware and a new location. The 3h downtime was mainly due to the poor CPU performance after the patch.

Final Notes

  • Some of our machines are now running in Europe instead of NY. This won’t have any major impact on our end users.

  • Our monitor on the api.rethumb.com is hitting the CloudFlare cache. We will add new monitors with statistics from our own servers without the cache in front.

Outage 14/Feb/2018

At the moment rethumb is having some performance issues. This is being addressed with our VPN provider.

The root cause is related to software upgrades to mitigate the Meltdown and Spectre issues.

Updates

17/Feb/2018

We have decided to start migrating our infrastructure to new providers.

18/Feb/2018

The first phase of the infrastructure migration is now done. We will continue working to migrate the remaining machines.

We expect to have the migration done later today. Until then, expect some slowdown when processing new images.

19/Feb/2018

Our infrastructure migration is now complete and the system is stable.

After this episode we will take some measures to prevent these issues in the future.

We will also take some additional measures such as:

  • Create a public dashboard with current service status.
  • Use our Twitter account to publish details about outages.
  • Use our blog to report outages and on-going efforts to mitigate them.

Release v66

Starting today we will publish a post for every release of a new API version. These posts aim to share internal changes, bugfixes and new features in each release of rethumb.

Release: v66
Date: 06/Feb/2018


#1 Bugfix

Fixed the fallback to the original image, which is used when the system can’t process the user request and has to send back the original image instead of a processed one.

To configure the timeout behaviour users can access the “Source section > Timeout Action” in the Dashboard.
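As a rough illustration of this kind of fallback, here is a minimal Python sketch. It is not the actual rethumb implementation: the function names, the `timeout_action` values ("original" and "error") and the fake `resize` step are assumptions made for the example.

```python
import concurrent.futures
import time


def resize(image_bytes, delay):
    """Hypothetical processing step; `delay` simulates slow processing."""
    time.sleep(delay)
    return b"thumb:" + image_bytes


def process_with_fallback(image_bytes, delay, timeout, timeout_action="original"):
    """Return the processed image, or apply the timeout action on a timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(resize, image_bytes, delay)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        if timeout_action == "original":
            # Send back the unprocessed image instead of failing the request.
            return image_bytes
        raise
    finally:
        # Do not block the response while the slow worker finishes.
        pool.shutdown(wait=False)


fast = process_with_fallback(b"img", delay=0.0, timeout=1.0)
slow = process_with_fallback(b"img", delay=0.3, timeout=0.05)
```

In this sketch `fast` holds the processed bytes, while `slow` holds the original bytes because processing took longer than the allowed time.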

Short-term Roadmap 2

It is time to share our roadmap (again). These are the features that will get our attention in the upcoming months.


Progressive JPGs

Add a new image format: Progressive JPGs. At the moment our JPGs are interlaced but for some cases progressive can achieve better results.


Quality Parameter

Add a new parameter to allow our users to choose the final image quality.


Image optimization

Add a new operation to apply automatic image optimization in order to reduce file size while keeping the same visual quality.


Keep GIF animations

Add a new parameter to keep GIF animations. At the moment we create a thumbnail from the first frame, but we would like to offer the option of creating a thumbnail from the complete GIF.

HTTPS for Custom Domains

We have added HTTPS support for our CNAME records for free.

All our CNAME records now come with a free HTTPS certificate by Let’s Encrypt. These are trusted by all major browsers and allow our users to have a fully encrypted site.

When using our service directly on a site you might want a CNAME record to keep everything on the same domain, but without HTTPS support from us that would be an issue, leading the browser to notify your users that some parts are not encrypted.

Now we automatically provide HTTP and HTTPS for all the CNAME records that you register with us, with no additional fee or limit.

Our own domain api.rethumb.com still operates with http:// and https://.

Cloudflare Integration

Since last June we have had CloudFlare CDN enabled for our API domain: api.rethumb.com.

With this change we can leverage the amazing CloudFlare network (with over 80 delivery points) and give our users better performance and security. Our API is a highly cacheable endpoint, so it makes sense to cache all our data near our end users.

How does this work? CloudFlare stands between your user's browser and our own servers. When a user requests an image for the first time CloudFlare passes the request to us. We then process the image and return the final image. CloudFlare will cache this image and subsequent requests won’t hit our servers (being served by CloudFlare alone). Virtually 100% of our requests are cache hits so this will bring a significant speed-up for our end users.

We are also better protected against DDoS attacks and even when all our servers are down we can operate on some level.
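The request flow described above can be sketched in a few lines of Python. This is only a toy model of a cache sitting in front of an origin server, not how CloudFlare is implemented: the `Origin` and `EdgeCache` classes, the example URL path and the "HIT"/"MISS" labels are hypothetical names used for illustration.

```python
class Origin:
    """Hypothetical origin server that processes images on demand."""

    def __init__(self):
        self.hits = 0

    def render(self, url):
        self.hits += 1
        return f"image-for:{url}".encode()


class EdgeCache:
    """Toy cache that stands between the browser and the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        if url in self.cache:
            # Served from the cache alone; the origin is not contacted.
            return self.cache[url], "HIT"
        # First request for this URL: pass it to the origin and store the result.
        body = self.origin.render(url)
        self.cache[url] = body
        return body, "MISS"


origin = Origin()
edge = EdgeCache(origin)

first = edge.get("/example/image.jpg")   # reaches the origin
second = edge.get("/example/image.jpg")  # answered by the cache
```

After both calls the origin has been contacted only once, which is why a high share of cache hits translates into less load on the origin and faster responses.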

Note: clients using CNAME records won’t have access to this feature – they will have to enable CloudFlare CDN for their domain.

Tutorials in Kotlin programming language

If you want to use rethumb with Kotlin check these tutorials: