Planet Varnish

  • 2015-09-19 03:00 Kristian Lyngstøl

    Posted on 2015-09-19

    I recently went back to working for Redpill Linpro, and thus started working with Varnish again, after being on the side lines for a few years.

    I've been using Varnish since 2008. And a bit more than just using it too. There's been a lot of great change over time, but there are still things missing. I recently read http://kacper.blog.redpill-linpro.com/archives/743 and while I largely agree with Kacper, I think some of the bigger issues are missing from the list.

    So here's my attempt to add to the debate.

    TLS/SSL

    Varnish needs TLS/SSL.

    It's the elephant in the room that nobody wants to talk about.

    The world is not the same as it was in 2006. Varnish is used for more and more sensitive sites, and a larger percentage of Varnish installations now have some sort of TLS/SSL termination attached to them.

    TLS/SSL has been a controversial issue in the history of Varnish Cache, with PHK (Principal architect of Varnish Cache - https://www.varnish-cache.org/docs/trunk/phk/ssl.html) being an outspoken opponent of adding TLS in Varnish. There are valid reasons, and heartbleed has most certainly proven many of PHK's grievances right. But what does that matter when we use TLS/SSL anyway? It's already in the stack, we're just closing our eyes to it.

    Setting up nginx in front of Varnish to get TLS/SSL, then nginx behind Varnish to get TLS/SSL... That's just silly. Why not just use nginx to cache then? The lack of TLS/SSL in Varnish is a great advertisement for nginx.

    There are a lot of things I dislike about TLS/SSL, but we need it anyway. There's the hitch project (http://hitch-tls.org), but it's not really enough. We also need TLS/SSL to the backends, and a tunnel-based solution isn't enough. How would you do smart load balancing through that? If we don't add TLS/SSL, we might as well just forget about backend directors altogether. And it has to be an integral part of all backends.

    We can't have a situation where some backend directors support TLS/SSL and some don't.

    Varnish Software is already selling this through Varnish Cache Plus, their proprietary version of Varnish. That is obviously because it's a deal breaker in a lot of situations. The same goes for basically any serious commercial actor out there.

    So we need TLS/SSL. And we need it ASAP.

    Note

    After speaking to PHK, let me clarify: he's not against adding support for TLS, only against adding TLS itself. Varnish now supports the PROXY protocol, which was added explicitly to improve support for TLS termination. Further additions along those lines would likely be acceptable, as long as the TLS itself is done outside of Varnish.

    Better procedures for VCL changes

    With every Varnish version, VCL (the configuration language for Varnish) changes either a little or a lot. Some of these changes are unavoidable due to internal Varnish changes. Some changes are there to make the language more accurate (e.g. changing req.request to req.method, to reflect that it's the request method).

    If Varnish is part of your day-to-day work, then this might not be a huge deal. You probably keep up-to-date on what's going on with Varnish anyway. But most users aren't there. We want Varnish to be a natural part of your stack, not a special thing that requires a "varnish-admin".

    This isn't necessarily an easy problem to solve. We want to be able to improve VCL and get rid of old mistakes (e.g., changing req.request to req.method is a good thing for VCL). We've also changed the way error messages (or custom Varnish-generated messages) are done numerous times, and the way hitpass objects (a complicated aspect of any cache) are created.
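    To illustrate the kind of churn involved, here is a hedged sketch of the same trivial rules written for Varnish 3 and then for Varnish 4. The subroutine and variable renames are the real ones from the upgrade; the header and message contents are made up:

```vcl
# Varnish 3 (sketch):
sub vcl_recv {
    if (req.request == "POST") {
        return (pass);
    }
}
sub vcl_error {
    set obj.http.Content-Type = "text/plain";
    synthetic {"Sorry, something broke."};
    return (deliver);
}

# Varnish 4 equivalent (sketch); note req.request -> req.method,
# and custom messages moving from vcl_error to vcl_synth:
sub vcl_recv {
    if (req.method == "POST") {
        return (pass);
    }
}
sub vcl_synth {
    set resp.http.Content-Type = "text/plain";
    synthetic("Sorry, something broke.");
    return (deliver);
}
```

    Each rename is trivial on its own; it's the accumulation across versions that makes upgrades feel arbitrary.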

    A few simple suggestions:

    - Review all VCL changes in public, as a whole, before the release process even starts, to avoid having to change them again two versions down the line.
    - Keep backward compatibility where possible, with warnings, or even an extra option required to allow it. E.g.: req.request could easily still work; there's no conflict there. Not forever, but perhaps until the end of a major version. Not everything can be backwards compatible, but some things can.

    I've had numerous complaints from highly skilled sysadmins who are frustrated by this aspect of Varnish. They just don't want to upgrade because they have to do what feels like arbitrary VCL changes every single time. Let's see if we can at least REDUCE that.

    Documentation?

    There's a lot of documentation for Varnish, but there's also a lot of bad documentation. Some issues:

    People Google and end up on random versions on varnish-cache.org. No, telling people "but there's a version right there, so it's your own fault!" is not an acceptable solution. Varnish Software themselves recently had a link in their Varnish Book that pointed to "trunk" instead of "4.0", whereupon the "here is a complete list of changes between Varnish 3 and Varnish 4" link was actually a link to the changes between Varnish 4.0 and the next version of Varnish.

    "user guide" and "tutorial" and "installation"? Kill at least two and leave the others for blog posts or whatever. It's hard enough to maintain one with decent quality.

    Generated documentation needs to be improved. Example:

    Prototype
        STRING fileread(PRIV_CALL, STRING)
    Description
        Reads a file and returns a string with the content. Please note that it is not recommended to send variables to this function, as the caching in the function doesn't take this into account. Also, files are not re-read.
    Example
        set beresp.http.served-by = std.fileread("/etc/hostname");

    PRIV_CALL should clearly not be exposed! Other examples are easy enough to find.

    In addition, the Description is a mixture of reference documentation style and elaboration. Reference documentation should be clearly separated from analysis of consequences so technical users don't have to reverse-engineer a sentence of "don't do this because X" to figure out what the code actually does.

    And where are the details? What happens if the file can't be opened? What are the memory constraints? It says it returns the content of the file as a string, but what happens with binary content? There's clearly some caching of the file, but how does that work? Per session? Per VCL? Does that cache persist when you do varnishadm stop; varnishadm start? That's completely left out.

    Rants mixed in with documentation? Get rid of "doc/sphinx/phk" (https://www.varnish-cache.org/docs/4.0/phk/) and reference it somewhere else instead. Varnish-cache.org/doc should not be a weird blog-space; it clutters the documentation space. Varnish is not a small little project any more, it's grown past this.

    VMOD packages

    Varnish vmods are awesome. You can design some truly neat solutions using Open Source vmods, or proprietary ones.

    But there are no even semi-official package repositories for the open source vmods. Varnish Software offers this to customers, but I really want it for the public too: both for my own needs, and because it's important for improving Varnish and VMOD adoption.

    Until you can do "apt-get install varnish-vmod-foo" or something like that, VMODs will not get the attention they deserve.

    There are some projects in the works here, though, so stay tuned.

    TLS/SSL

    In case you missed it, I want TLS/SSL.

    I want to be able to type https://<varnish host>

    BTW: Regarding terminology, I decided to go with "TLS/SSL" instead of either "SSL" or "TLS" after some feedback. I suppose "TLS" is correct, but "SSL" is more recognized, whether we like it or not.

    Comments
    by Planet Varnish at 2015-09-19 03:00
  • 2015-08-26 11:48 hildur@varnish-software.com
    We are very pleased to have been included in Forrester Research’s recent “CDN And Digital Performance Vendor Landscape, Q3 2015”  report by analysts Mark Grannan with Ted Schadler and Kevin Driscoll. 
    by Hrafnhildur Smaradottir at 2015-08-26 11:48
  • 2015-08-23 22:59 kacper

    I’ve been meaning to write a blog entry about Varnish for years now. The closest I’ve come is to write a blog about how to make Varnish cache your debian repos, make you a WikiLeaks cache and I’ve released Varnish Secure Firewall, but that without a word on this blog. So? SO? Well, after years it turns out there is a thing or two to say about Varnish. Read on to find out what annoys me and people I meet the most.

    Although you could definitely call me a “Varnish expert” and even a sometimes contributor, and I do develop programs, I cannot call myself a Varnish developer because I’ve shamefully never participated in a Monday evening bug wash. My role in the Varnish world is more… operative. I am often tasked with helping ops people use Varnish correctly, justify its use and cost to their bosses, defend it from expensive and inferior competitors, and sit up long nights with load tests just before launch days. I’m the guy that explains the low risk and high reward of putting Varnish in front of your critical site, the guy that makes it actually be low risk, and I’ll be the first guy on the scene when the code has just taken a huge dump on the CEO’s new pet Jaguar. I am also sometimes the guy who tells these stories to the Varnish developers, although of course they also have other sources. The consequence of this .. lifestyle choice .. is that what code I do write is either short and to the point or .. incomplete.

    I know we all love Varnish, which is why after nearly 7 years of working with this software I’d like to share with you my pet peeves about the project. There aren’t many problems with this lovely and lean piece of software, but those which are there are sharp edges that pretty much everyone stubs a toe or snags their head on. Some of them are specific to a certain version, while others are “features” present in nearly all versions.

    And for you Varnish devs who will surely read this, I love you all. I write this critique of the software you contribute to, knowing full well that I haven’t filed bug reports on any of these issues and therefore I too am guilty in contributing to the problem and not the solution. I aim to change that starting now :-) Also, I know that some of these issues are better lived with than fixed, the medicine being more hazardous than the disease, so take this as all good cooking; with a grain of salt.

    Silent error messages in init scripts

    Some genius keeps inserting 1>/dev/null 2>&1 into the startup scripts on most Linux distros. This might be in line with some wacko distro policy, but it makes conf errors, and in particular VCL errors, way harder to debug for the common man. Even worse, the `service varnish reload` script calls `varnish-vcl-reload -q`; that’s q for please-silence-my-fatal-conf-mistakes, and the best way to fix this is to *edit the init script and remove the offender*. Mind your p’s and q’s eh, it makes me sad every time, but where do I file this particular bug report?

    debug.health still not adequately documented

    People go YEARS using Varnish without discovering watch varnishadm debug.health. Not to mention that it’s anyone’s guess that this has to do with probes, and that there are no other debug.* parameters, except for the totally unrelated debug parameter. Perhaps this was decided to be dev-internal at some point, but the probe status is actually really useful in precisely this form. debug.health is still absent from the param.show list and the man pages, while in 4.0 some probe status and backend info has been put into varnishstat, which I am sure I am not the only one to be very thankful for indeed.

    Bad naming

    Designing a language is tricky.

    Explaining why purge is now ban, and that what is now purge is something else, is mind-boggling. This issue will be fixed in 10 years, when people are no longer running varnish 2.1 anywhere. Explaining all the three-letter acronyms that start with V is just a gas.
    Showing someone ban("req.url = " + req.url) for the first time is bound to make them go “oh” like a raccoon just caught sneaking through your garbage.
    Grace and Saint mode… that’s biblical, man. Understanding what it does and how to demonstrate the functionality is still for Advanced Users, explaining this to noobs is downright futile, and I am still unsure whether we wouldn’t all be better off for just enabling it by default and forgetting about it.
    I suppose if you’re going to be awesome at architecting and writing software, it’s going to get in the way of coming up with really awesome names for things, and I’m actually happy that’s still the way they prioritize what gets done first.

    Only for people who grok regex

    Sometimes you’ll meet Varnish users who do code but just don’t grok regex. It’s weak, I know, but this language isn’t for them.

    Uncertain current working directory

    This is a problem on some rigs which have VCL code in stacked layers, or really anywhere where it’s more appropriate to call the VCL a Varnish program, as in “a program written for the Varnish runtime”, rather than simply a configuration for Varnish.

    You’ll typically want to organize your VCL in such a way that each VCL file is standalone with if-wrapped rules, and they’re all included from one main VCL file, stacking all the vcl_recv’s and vcl_fetches.

    Because distros don’t agree on where to put varnishd’s current working directory (which happens to be wherever it was launched from, instead of always chdir $(dirname $CURRENT_VCL_FILE)), you can’t reliably specify include statements with relative paths. This forces us to use hardcoded absolute paths in includes, which is neither pretty nor portable.
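    A sketch of the stacked-include layout described above; the paths and file names are hypothetical, and absolute precisely because of the working-directory problem:

```vcl
# /etc/varnish/main.vcl -- hypothetical top-level file.
# Absolute paths, because varnishd's working directory is
# wherever it was started from and varies between distros.
include "/etc/varnish/sites/site-a.vcl";
include "/etc/varnish/sites/site-b.vcl";
```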

    Missing default director in 4.0

    When translating VCL to 4.0 there is no longer any language for director definitions, which means they are done in vcl_init(), which means your default backend is no longer the director you specified at the top, which means you’ll have to rewrite some logic lest it bite you in the ass.

    Also, director.backend() has no string representation (unlike backend_hint), so you cannot do old-style name comparisons; i.e. backends are first-class objects, but directors are another class of objects.
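    For reference, a minimal sketch of how director definitions look in VCL 4.0, where they have moved into vcl_init (the backend names and addresses are made up):

```vcl
import directors;

backend www1 { .host = "192.0.2.11"; .port = "8080"; }
backend www2 { .host = "192.0.2.12"; .port = "8080"; }

sub vcl_init {
    # Directors are now vmod objects created at VCL load time...
    new rr = directors.round_robin();
    rr.add_backend(www1);
    rr.add_backend(www2);
}

sub vcl_recv {
    # ...and must be selected explicitly; there is no implicit
    # "default director" as with the old top-level director syntax.
    set req.backend_hint = rr.backend();
}
```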

    VCL doesn’t allow unused backends or probes

    Adding and removing backends is a routine ordeal in Varnish.
    Quite often you’ll find it useful to keep backup backends around that aren’t enabled, either as manual failover backups, because you’re testing something or just because you’re doing something funky. Unfortunately, the VCC is a strict and harsh mistress on this matter: you are forced to comment out or delete unused backends :-(

    Workarounds include using the backends inside some dead code or constructs like

    sub vcl_recv {
        set req.backend_hint = unused;
        set req.backend_hint = default;
        ...
    }

    It’s impossible to determine how many bugs this error message has avoided by letting you know that backend you just added, er yes that one isn’t in use sir, but you can definitely count the number of Varnish users inconvenienced by having to “comment out that backend they just temporarily removed from the request flow”.

    I am sure it is wise to warn about this, but couldn’t it have been just that, a warning? Well, I guess maybe not, considering distro packaging is silencing error messages in init and reload scripts..

    To be fair, this is now configurable in Varnish by setting vcc_err_unref to false, but couldn’t this be the default?

    saintmode_threshold default considered harmful

    If many different URLs keep returning bad data or error codes, you might conceivably want the whole backend to be declared sick instead of growing some huge list of sick URLs for this backend. What if I told you your developers just deployed an application which generates 50x error codes, triggering your saintmode for an infinite amount of URLs? Well, then you have just DoSed yourself, because you hit this threshold. I usually enable saintmode only after giving my clients a big fat warning about this one, because quite frankly it easily comes straight out of left field every time. Either saintmode is off, or the threshold is Really Large™ or even ∞, and only in some special cases do you actually want this set to an actual number.

    Then again, maybe it is just my clients and the wacky applications they put behind Varnish.

    What is graceful about the saint in V4?

    While we are on the subject (grace mode being the most often misunderstood feature of Varnish), the thing has changed so radically in Varnish 4 that it is no longer recognizable to users, and they often make completely reasonable but devastating mistakes trying to predict its behavior.

    To be clear on what has happened: saint mode is deprecated as a core feature in V4.0, while the new architecture now allows a type of “stale-while-revalidate” logic. A saintmode vmod is slated for Varnish 4.1.

    But as of 4.0, say you have a bunch of requests hitting a slow backend. They’ll all queue up while we fetch a new one, right? Well yes, and then they all error out when that request times out, or if the backend fetch errors out. That sucks. So lets turn on grace mode, and get “stale-while-revalidate” and even “stale-if-error” logic, right? And send If-Modified-Since headers too, sweet as.

    Now that’s gonna work when the request times out, but you might be surprised that it does not when the request errors out with 50x errors. Since beresp.saint_mode isn’t a thing anymore in V4, those error codes are actually going to knock the old object outta cache and each request is going to break your precious stale-while-error until the backend probe declares the backend sick and your requests become grace candidates.

    Ouch, you didn’t mean for it to do that, did you?

    And if, gods forbid, your apphost returns 404s when some backend app is not resolving, bam, you are in a cascading hell fan fantasy.

    What did you want it to do, behave sanely? A backend response always replaces another backend response for the same URL – not counting vary-headers. To get a poor man’s saint mode back in Varnish 4.0, you’ll have to return (abandon) those erroneous backend responses.
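    A minimal sketch of that poor man’s saint mode in 4.0 VCL (the status threshold here is just an example; tune it to what your backends actually emit):

```vcl
sub vcl_backend_response {
    # Don't let a 50x response replace a (possibly stale but
    # servable) object already in cache; abandon the fetch instead.
    if (beresp.status >= 500) {
        return (abandon);
    }
}
```

    Note that, as far as I can tell, an abandoned foreground fetch still surfaces an error to the waiting client in 4.0; the gain is that the cached object survives the bad response and stays available as a grace candidate.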

    Evil grace on unloved objects

    For frequently accessed URLs grace is fantastic, and will save you loads of grief, and those objects could have large grace times. However, rarely accessed URLs suffer a big penalty under grace, especially when they are dynamic and meant to be updated from the backend. If that URL is meant to be refreshed from the backend every hour, and Varnish sees many hours between each access, it’s going to serve up that many-hour-old stale object while it revalidates its cache.
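    One hedged way to express that trade-off in VCL is to grant long grace only where it helps; the URL pattern and the numbers below are made up:

```vcl
sub vcl_backend_response {
    set beresp.ttl = 1h;
    if (bereq.url ~ "^/popular/") {
        # Hot URLs: serving stale for a long while during
        # revalidation is a feature here.
        set beresp.grace = 6h;
    } else {
        # Rarely hit, frequently changing URLs: keep grace short
        # so users don't get many-hour-old objects.
        set beresp.grace = 30s;
    }
}
```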


    This diagram might help you understand what happens in the “200 OK” and “50x error” cases of graceful request flow through Varnish 4.0.

    Language breaks on major versions

    This is a funny one, because the first major language break I remember was the one that I caused myself. We were making security.vcl and I was translating rules from mod_security and having trouble with it, because Varnish used POSIX regexes at the time, and I was writing this really god-awful script to translate PCRE into POSIX when Kristian, who conceived of security.vcl, went to Tollef (they were both working in the same department at the time) and asked in his classical brook-no-argument kind of way “why don’t we just support Perl regexes?”.
    Needless to say, (?i) spent a full 12 months afterwards cursing myself while rewriting tons of nasty client VCL code from POSIX to PCRE and fixing occasional site-devastating bugs related to case-sensitivity.

    Of course, Varnish is all the better for the change, and would get nowhere fast if the devs were to hang on to legacy, but there is a lesson in here somewhere.

    So what's a couple of sed 's/req.request/req.method/'s every now and again?
    This is actually the main reason I created the VCL.BNF. For one, it got the devs thinking about the grammar itself as an actual thing (which may or may not have resulted in the cleanups that make VCL a very regular and clean language today), but my intent was to write a parser that could parse any version of VCL and spit out any other version of VCL, optionally with pruning and pretty-printing of course. That is still really high on my todo list. Funny how my clients will book all my time to convert their code for days, but will not spend a dime on me writing code that would basically make the conversion free and painless for everyone forever.
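    Until such a parser exists, the conversions really are little more than this kind of thing. A crude sketch; blind sed is only safe for identifiers that changed one-to-one and never appear inside strings:

```shell
# Create a toy Varnish 3 snippet, then apply the VCL 3 -> 4 rename.
printf 'if (req.request == "PURGE") { return (lookup); }\n' > /tmp/old.vcl
sed 's/req\.request/req.method/g' /tmp/old.vcl > /tmp/new.vcl
cat /tmp/new.vcl
# if (req.method == "PURGE") { return (lookup); }
```

    A real converter would work on the parse tree instead, so it could also handle the renames that are not one-to-one (vcl_error to vcl_synth, directors, and so on).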

    Indeed, most of these issues are really hard to predict consequences of implementation decisions, and I am unsure whether it would be possible to predict these consequences without actually getting snagged by the issues in the first place. So again: varnish devs, I love you, what are your pet peeves? Varnish users, what are your pet peeves?

    Errata: vcc_err_unref has existed since Varnish 3.

    by Planet Varnish at 2015-08-23 22:59
  • 2015-08-11 11:19 perbu@varnish-software.com
    If your content ever changes you’ll need some way to make sure the updated content reaches the users. The traditional way of doing this is to devise some sort of cache invalidation.
    by Per Buer at 2015-08-11 11:19
  • 2015-08-05 05:25 perbu@varnish-software.com
    Varnish Cache is versatile. To date we’ve seen it utilized as a website cache, API gateway/manager, API cache, CDN reverse proxy and a few others.
    by Per Buer at 2015-08-05 05:25
  • 2015-07-23 17:07 perbu@varnish-software.com
    Varnish is typically very busy, running several thousands of transactions per second. Combine this with the rather extreme verbosity of varnishlog, and you have a firehose of information that can be rather hard to manage. 
    by Per Buer at 2015-07-23 17:07
  • 2015-07-01 11:01 perbu@varnish-software.com
    Sometimes your web application needs to maintain state per session. This can cause problems when you are using a load balancer such as Varnish Cache. In order to mitigate this we need to make sure Varnish is fully aware of what is going on and that the sessions stick to the client. In practice, it means we need to make sure that a returning visitor gets the same application server every time.
    by Per Buer at 2015-07-01 11:01
  • 2015-06-26 16:09 ingvar

    The Varnish project has a new little free software baby arriving soon: Hitch, a scalable TLS proxy. It will also be made available with support by Varnish Software as part of their Varnish Plus product.

    A bit of background:

    Varnish is a high-performance HTTP accelerator, widely used over the Internet. To use varnish with https, it is often fronted by other general http/proxy servers like nginx or apache, though a more specific proxy-only high-performance tool would be preferable. So they looked at stud.

    hitch is a fork of stud. The fork is maintained by the Varnish development team, as stud seems abandoned by its creators after the project was taken over by Google, with no new commits since 2012.

    I wrapped hitch for fedora, epel6 and epel7. Packages are available here: http://users.linpro.no/ingvar/varnish/hitch/ , including yum repo configurations. The default config is for a single instance of hitch.

    There is also a package review request for Fedora (bz #1235305). I will fork for EPEL as soon as the package is accepted.

    Note that there also exists a Fedora package of the (old) version of stud. If you use stud on Fedora and want to test hitch, the two packages may coexist, and should be installable in parallel.

    To test hitch in front of varnish, in front of apache, you may do something like this (tested on el7):

    1. Install varnish, httpd and hitch:

           wget http://users.linpro.no/ingvar/varnish/hitch/el7/hitch.repo
           sudo cp hitch.repo /etc/yum.repos.d
           sudo yum install httpd varnish hitch

    2. Start apache:

           sudo systemctl start httpd.service

    3. Edit the varnish config to point to the local httpd, that is, change the default backend definition in /etc/varnish/default.vcl, like this:

           backend default {
               .host = "127.0.0.1";
               .port = "80";
           }

    4. Start varnish:

           sudo systemctl start varnish.service

    5. Add an ssl certificate to the hitch config. For a dummy certificate, the example.com certificate from the hitch source may be used:

           wget http://users.linpro.no/ingvar/varnish/hitch/default.example.com.pem
           sudo cp default.example.com.pem /etc/pki/tls/private/

    6. Edit /etc/hitch/hitch.conf, changing the pem-file option to use that cert:

           pem-file = "/etc/pki/tls/private/default.example.com.pem"

    7. Start hitch:

           sudo systemctl start hitch.service

    8. Open your local firewall if necessary, by something like this:

           sudo firewall-cmd --zone=public --add-port=8443/tcp

    9. Point your web browser to https://localhost:8443/ . You should be greeted with a warning about a non-official certificate. Past that, you will get the apache frontpage through varnish and hitch.

    Enjoy, and let me hear about any interesting test results.

    Ingvar

    Varnish Cache is a powerful and feature-rich front side web cache. It is also very fast, that is, fast as in on steroids, and powered by The Dark Side of the Force.

    Redpill Linpro is the market leader for professional Open Source and Free Software solutions in the Nordics, though we have customers from all over. For professional managed services, all the way from small web apps, to massive IPv4/IPv6 multi data center media hosting, and everything through container solutions, in-house, cloud, and data center, contact us at www.redpill-linpro.com.

    by Ingvar Hagelund at 2015-06-26 16:09
  • 2015-06-26 16:09 ingvar

    The Varnish project has a new little free software baby arriving soon: Hitch, a scalable TLS proxy. It will also be made available with support by Varnish Software as part of their Varnish Plus product.

    A bit of background:

    Varnish is a high-performance HTTP accelerator, widely used over the Internet. To use varnish with https, it is often fronted by other general http/proxy servers like nginx or apache, though a more specific proxy-only high-performance tool would be preferable. So they looked at stud.

    hitch is a fork of stud. The fork is maintained by the Varnish development team, as stud seems abandoned by its creators after the project was taken over by Google, with no new commits since 2012.

    I wrapped hitch for fedora, epel6 and epel7, and submitted them for Fedora and EPEL. Please test the latest builds and add feedback: https://admin.fedoraproject.org/updates/search/hitch . The default config is for a single instance of hitch.

    The package has been reviewed and was recently accepted into Fedora and EPEL (bz #1235305). Update, August 2015: packages are pushed for testing. They will trickle down to stable eventually.

    Note that there also exists a Fedora package of the (old) version of stud. If you use stud on Fedora and want to test hitch, the two packages may coexist, and should be installable in parallel.

    To test hitch in front of varnish, in front of apache, you may do something like this (tested on el7):

    1. Install varnish, httpd and hitch:

           sudo yum install httpd varnish
           sudo yum --enablerepo=epel-testing install hitch || sudo yum --enablerepo=updates-testing install hitch

    2. Start apache:

           sudo systemctl start httpd.service

    3. Edit the varnish config to point to the local httpd, that is, change the default backend definition in /etc/varnish/default.vcl, like this:

           backend default {
               .host = "127.0.0.1";
               .port = "80";
           }

    4. Start varnish:

           sudo systemctl start varnish.service

    5. Add an ssl certificate to the hitch config. For a dummy certificate, the example.com certificate from the hitch source may be used:

           wget http://users.linpro.no/ingvar/varnish/hitch/default.example.com.pem
           sudo cp default.example.com.pem /etc/pki/tls/private/

    6. Edit /etc/hitch/hitch.conf, changing the pem-file option to use that cert:

           pem-file = "/etc/pki/tls/private/default.example.com.pem"

    7. Start hitch:

           sudo systemctl start hitch.service

    8. Open your local firewall if necessary, by something like this:

           sudo firewall-cmd --zone=public --add-port=8443/tcp

    9. Point your web browser to https://localhost:8443/ . You should be greeted with a warning about a non-official certificate. Past that, you will get the apache frontpage through varnish and hitch.

    Enjoy, and let me hear about any interesting test results.

    Ingvar


    by Planet Varnish at 2015-06-26 16:09
  • 2015-05-15 15:05 Lasse Karstensen

    The last couple of weeks we’ve been pretty busy making SSL/TLS support for Varnish Cache Plus 4. Now that the news is out, I can follow up with some notes here.

    The setup will be a TLS terminating proxy in front, speaking PROXY protocol to Varnish. Backend/origin support for SSL/TLS has been added, so VCP can now talk encrypted to your backends.
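    As a sketch of how the pieces might fit together; the file paths, ports and exact option spellings below are assumptions based on the stud heritage, not taken from the announcement:

```
# hitch.conf (sketch): terminate TLS on :443, forward plaintext
# plus a PROXY protocol header to Varnish on a local port.
frontend = "[*]:443"
backend  = "[127.0.0.1]:6086"
pem-file = "/etc/hitch/example.com.pem"
write-proxy = on
```

    On the Varnish side this would pair with a listen socket marked for the PROXY protocol, presumably something like varnishd -a '[127.0.0.1]:6086,PROXY' (again an assumption about the exact spelling).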

    On the client-facing side we are forking the abandoned TLS proxy called stud, and giving it a new name: hitch.

    hitch will live on github as a standalone open source project, and we are happy to review patches/pull requests made by the community. Here is the source code: https://github.com/varnish/hitch

    We’ve picked all the important patches from the flora of forks, and merged them into a hopefully stable tool. Some of the new stuff includes: TLS 1.1, TLS 1.2, SNI, wildcard certs, and multiple listening sockets. See the CHANGES.rst file for the details.

    Varnish Software will provide support on it for commercial uses, under the current Varnish Plus product package.


    by Lasse Karstensen at 2015-05-15 15:05
