Security issue discovered: Are you performing SSL decryption with Websense? Read this.


Top 50 Contributor
Posts 69
tom1231 Posted: 13 Oct 2011 11:57 AM

To date I've created or commented on other threads on this forum regarding this, but this thread serves to centralize the issue. Since the Google crawler seems to hit these forums, hopefully this will get some attention.

If you are reading this thread and the issues below pertain to you as well, please comment below.

My corporation chose to purchase Websense in order to perform web filtering, as well as MITM (man in the middle) SSL decryption/monitoring for Data Loss Prevention.

As it currently stands, for a secure implementation of Websense with SSL decryption enabled and an internal certificate presented to end users, you must enable the Certificate Verification Engine feature in the Websense Content Gateway. This feature performs various checks against the external SSL certificate to confirm that it is valid.

If you do not enable this certificate engine while performing SSL decryption, you are essentially flying blind: other MITM schemes and invalid cert issuers can intercept your data, and no one in your organization will know. (E.g. think about the recent issues with DigiNotar certs being hacked and Gmail victims falling prey.)

For example, take a visit to https://www.gmail.com. With SSL decryption enabled, end users will see that this website is using a valid certificate, one that is issued by your company internally, essentially masking the actual SSL certificate. The verification engine should then validate the external SSL certificate. If this validation fails, a warning should be displayed to the end user, much like the warning you get when visiting a site with an expired/invalid certificate.
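To see which certificate a client is actually being shown, one can compare the issuer of the presented certificate with the internal CA's name. A minimal Python sketch, assuming the certificate has already been fetched and decoded into the dictionary shape that `ssl.SSLSocket.getpeercert()` returns. The CA name "Example Corp Internal CA" and the sample data are made up for illustration and have nothing to do with Websense itself.

```python
def issuer_common_name(cert):
    """Return the issuer's commonName from a getpeercert()-style dict, or None."""
    # 'issuer' is a tuple of RDNs; each RDN is a tuple of (key, value) pairs.
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def is_intercepted(cert, internal_ca_names):
    """True if the certificate was issued by one of the internal (proxy) CAs."""
    return issuer_common_name(cert) in internal_ca_names


# Hypothetical internal CA name; replace with your own.
INTERNAL_CAS = {"Example Corp Internal CA"}

# Made-up sample in the shape ssl.SSLSocket.getpeercert() returns.
seen_by_user = {
    "subject": ((("commonName", "www.gmail.com"),),),
    "issuer": ((("organizationName", "Example Corp"),),
               (("commonName", "Example Corp Internal CA"),)),
}

print(is_intercepted(seen_by_user, INTERNAL_CAS))  # True
```

When this reports True, the browser only ever sees the proxy's certificate, so any check of the real server certificate has to happen on the proxy.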

To date, the verification engine feature does not work without causing massive issues in an environment.

Here are two issues that I've identified so far:


  • [Minor] When Websense validates a certificate, there is an option to check the CRL (certificate revocation list) to determine whether the certificate has been revoked. The problem is that many certificates in use on the internet seemingly have problems with, or don't adhere to, this standard (not sure why). The easy solution would be to disable the CRL check option under the verification engine. However, disabling it does not currently work. The result is many end users bombarding the helpdesk wondering what the Websense block "verify deny = 0" means.
  • [Major] Certain websites, such as wellsfargo.com, do not load properly, or do not load at all, via SSL. This is an intermittent issue. Since this is a banking website, it is imperative that SSL works. I have provided logs. I have provided data dumps. I have spent numerous hours troubleshooting this issue with Websense. Websense has even been able to reproduce the issue, but I have been told that I will need to impact my production environment further by enabling this feature long term to collect more dumps. This is a problem, as the [Minor] issue above floods the helpdesk line. Because of this, my 6+ month case has been closed, pending results for the issue above.
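For comparison, this is roughly what a working CRL toggle looks like in a generic TLS client. The sketch uses Python's standard `ssl` module, not anything from the Websense Content Gateway; `make_context` is a name chosen here for illustration.

```python
import ssl


def make_context(check_crl):
    """Build a client-side TLS context with leaf CRL checking on or off."""
    # The default context verifies the chain and hostname but does not
    # enable any CRL checking.
    ctx = ssl.create_default_context()
    if check_crl:
        # Require a CRL for the leaf certificate. If no matching CRL has
        # been loaded into the context, verification fails rather than
        # passing silently.
        ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
    return ctx


strict = make_context(check_crl=True)
relaxed = make_context(check_crl=False)

print(bool(strict.verify_flags & ssl.VERIFY_CRL_CHECK_LEAF))   # True
print(bool(relaxed.verify_flags & ssl.VERIFY_CRL_CHECK_LEAF))  # False
```

The point of the comparison is that the flag is a single switch: with it off, revocation is simply not consulted, which is the behavior the [Minor] item expected from disabling the CRL option.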

This issue has been escalated to the point where a Sr. Manager of Technical Support has been involved, but still, no real traction yet.  To be fair,  it's only been 6+ months of troubleshooting/waiting.

The most troubling thing I've seen is that others on this forum who use SSL decryption appear to simply acknowledge that this is an issue and ignore/disable the verification engine. They've accepted the risk as technical engineers, but I can only wonder whether their IT management staff realize the data security ramifications.

Anyhow...

If you are reading this as a potential websense customer:  Be aware of this issue.  I'm not happy about this situation at all.    This is a web security problem.

If you are reading this as another company who is using SSL decryption, and have run into these issues, or know of further issues to raise,  chime in below. 

If you are a Websense staff member and care to check out my claims or offer some solutions, please do so! I welcome any and all comments, positive or negative. Both cases associated with my account have been escalated to backline, while one is currently closed pending results from the other case.

I'll be continually updating this thread, if it does not end up getting brownholed.

|
Top 50 Contributor
Posts 39
Here is a description of the issues we are having with SSL decryption. There are two problem areas: certificate errors and broken SSL. Websense Tech Support says the SSL proxy is SCIP from www.microdasys.com. We are asking Websense to work with us on a plan to resolve these issues.

Environment

Three non-clustered V10K appliances managed by one TRITON server using 7.6.0 code. We’re using WCCP with all appliances. Number of users across all appliances is approximately 1000.

Certificate Errors

We see a lot of certificate errors. These show up as incidents in the WCG. Many of the certificate errors appear to be legitimate, and they are evidence of bad certificate use on the Internet.

However, the vast majority of incidents are incidents where the ONLY error is “Unknown revocation state”. In fact, over 94% of the 541,966 certificate errors the proxy has seen since September fall into this category. These occur for 81 domain names, including well-known domain names such as facebook, google, aol, att, aetna, hp, amazon, citibank, and akamai. In some cases, the WCG will display the certificate errors to the user. The user can accept the warnings and the page will display. In other cases, the certificate errors break the site, and the page or elements of the page do not display.

If the certificate errors are displayed to the user, often the user must click through multiple warnings before the page displays. We have seen cases where the browser displays its own certificate error page, and when the user accepts that warning, the proxy displays its warning pages.

Our normal operational procedure for handling unknown revocation state errors is to create an Allow rule from the incident. We do this only to fix a broken site when users complain. We are wary of the security implications of this action, for example creating an opportunity for a man-in-the-middle attack.

The unknown revocation state errors seem to come and go for a site. For example, we created an Allow rule for www.facebook.com to fix an issue with facebook, and a few weeks later we removed that rule without the problem returning. One day recently, we suddenly started getting unknown revocation state errors for another well-known web site, and users complained of not being able to access the site. An Allow rule for this incident resolved the problem. We don’t know why we suddenly started getting this error for this site. What this means, though, is we can’t be sure from one day to the next what web sites will break.

Issues which need to be addressed immediately:

1. We need to know why the unknown revocation state errors are occurring. We don’t have the visibility into the WCG on the V10K to obtain this information. We don’t know whether the error occurs because of a problem on our end, a problem at the destination web site, or a problem at the CA.

2. We need to understand the security aspects of unknown revocation state and creating an Allow rule for these incidents.

3. We need to be able to rely on certificate validation to operate consistently from day to day. We can’t have web sites working one day and suddenly not working the next day due to unknown revocation state.

4. We need certificate validation behavior to be consistent with the validation behavior of browsers. When we remove the proxy, we can visit these sites with no certificate errors. If a browser accepts a certificate, so should the certificate validation engine.
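One possible reason for the gap described in item 4, offered as an assumption rather than anything Websense has confirmed: browsers of this era generally treated an unreachable or unusable CRL/OCSP source as a pass ("soft-fail"), while a validator that treats the same condition as an error ("hard-fail") will flag sites the browser accepts. A minimal Python sketch of the two policies; the status names and the `accept` function are illustrative, not the WCG's.

```python
from enum import Enum


class Revocation(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"   # CRL/OCSP unreachable, malformed, or absent


def accept(status, hard_fail):
    """Decide whether to accept a certificate given its revocation status."""
    if status is Revocation.REVOKED:
        return False              # both policies reject a revoked cert
    if status is Revocation.UNKNOWN:
        return not hard_fail      # soft-fail accepts, hard-fail rejects
    return True


for status in Revocation:
    print(status.value,
          accept(status, hard_fail=False),
          accept(status, hard_fail=True))
```

Under this model the two policies only disagree on the "unknown" row, which would be consistent with sites that load cleanly without the proxy yet raise "Unknown revocation state" incidents with it.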

Broken SSL

We see a lot of cases where the proxy breaks SSL/TLS communication. In these cases, the proxy prevents the user from accessing an HTTPS site, or it prevents a software program from accessing a network service over SSL/TLS. As a result, we have to bypass the proxy’s SSL decryption for the web site or network service.

In cases where an HTTPS web site is being visited by a user, the user sees a Websense page indicating a connection with the web server could not be established. The Websense page says “Could not connect to server”, with one of two reasons: “Peer suddenly disconnected” or “SSL/TLS Protocol alert: Handshake failure: Possibly no shared cipher”.

For “peer suddenly disconnected”, we have seen cases where disabling TLS1.0 in the browser will solve the problem. We have also seen cases where it does not. We tried disabling SNI, but we have yet to see this solve the problem.

Our normal operational procedure for fixing broken SSL issues is to bypass SSL decryption. First we try adding the bypass in TRITON. If that does not work (sometimes it doesn’t), we add the bypass to the WCG (this is called a tunnel in the WCG). If that does not work, we add the bypass to the Cisco ASA firewall (via IP addresses exempted from WCCP redirects). This is a cumbersome process involving multiple test iterations with users. We don’t know why bypasses sometimes don’t work in TRITON and why they sometimes don’t work in the WCG. We want to add the bypasses to TRITON so we can manage the bypasses centrally and not have to add a tunnel to each appliance/WCG or add an exception to each ASA firewall.

Sometimes SSL connections break outside of user browsing activity. These cases are where a software program is establishing the SSL connection. These cases include client programs running on the user’s desktop, such as online meeting software and software update mechanisms. These cases also include device and server software communicating out to web-based services, such as email hosting providers and postage metering systems. So far, we have relied on our users/administrators to notice these cases and report them to us. We don’t know how to find these cases proactively.

We don’t know how frequently broken SSL is occurring. We have 34 domains and 3 IPs bypassed in TRITON, 9 domains and 1 IP bypassed in one WCG (in addition to the default tunnels), and 2 IPs bypassed on one Cisco ASA firewall (in addition to the address blocks we had to bypass for Webex because the default WCG tunnel for *.webex.com did not work). We are concerned about the unmonitorable channels the SSL bypasses create into and out of our network, which could be used by attackers to bypass ACE and DLP.

Issues which need to be addressed immediately:

1. We need to know why broken SSL is occurring. Browsers establish SSL/TLS connections with the broken SSL web sites with no issues.

2. We need to know when broken SSL is occurring. We cannot remain in a position where we have to rely on users/administrators to let us know about broken SSL. We may have important network services (e.g. software update mechanisms) which are broken and we don’t know about it. We need to know what sites and IPs the broken SSL occurs for, the reason it occurs, and how often it occurs (statistics).

3. We need bypasses added to TRITON to work in all cases.

4. We need the proxy’s SSL implementation to be on par with browsers. Like browsers, the proxy needs to be able to accommodate the various SSL implementations on the Internet. There should be no difference in the ability of a browser to conduct SSL/TLS communication when the proxy is used or not used.
|
Not Ranked
Posts 4

CSCOTT, do you work where I work? This is almost verbatim the problems we run into and the workarounds we've implemented, although our ACL exclusions for WCCP redirects are much broader.

|
Top 10 Contributor
Posts 986
Trusted Users (MVP)

The longer I keep SSL Decryption in production (been a few months now) the more I start thinking I need to begin excluding more categories entirely from the process.  Even Websense technicians recommend that to me.  I may eventually dial it back so only sites I really want to be able to decrypt (Email, Social Networking, etc) are decrypted and everything else is excluded.  It's just becoming too much of a hassle otherwise.

I turned off Certificate Validation on Day 1 of production.  It's buggy and too difficult to fix sometimes.  The only thing I lose is validating CA's but if you've been keeping up with the news this year with all the CA's being compromised it seems that aspect of SSL encryption is becoming less and less useful.  Besides, it was either turn off Cert Validation or turn off SSL Decryption.

|
Top 50 Contributor
Posts 39

Jestertoo, thank you for letting us know we are not alone in the issues we are having with SSL decryption. Tom1231, thank you for starting this thread and Glitch for adding your comments as well. We've been in contact with our Websense account manager, who is looking for evidence other customers are having issues with SSL decryption. Hopefully, the information brought out in this forum will help Websense put priority on these issues.

Here's an update since my last post.

We upgraded to 7.6.2. The release notes don't indicate any changes in the SSL proxy, and so far we have not seen any improvements in SSL proxy behavior (it's been only a few days though).

We disabled SNI on all appliances. Since then, we have not had any reports of Connect Errors due to Peer Suddenly Disconnected, and we have not seen any adverse effects from disabling SNI.

We saw some specific cases indicating larger issues with certificate verification:

1. A user had issues visiting google docs and google calendar, and we saw over 50 incidents with only "unknown revocation state" errors for *.google.com. The incidents were for a variety of URL domains including google.com, youtube.com, and ytimg.com.

2. A user had issues visiting the hootsuite and pinterest web sites, which have content hosted on cloudfront, amazon's CDN service. We saw over 70 incidents with only "unknown revocation state" errors for *.cloudfront.net. The incidents were for a variety of domains -- cloudfront seems to manufacture domain names such as d1nu2m22elx8m.cloudfront.net and dvguhnjbfi9ks.cloudfront.net.

3. For https://wm01.harvardpilgrim.org, the SSL proxy says the certificate is expired but it's not. When the proxy is removed, IE and Firefox verify the certificate with no errors.

4. For www.digicert.com > My Account, the page is missing content/functionality. The feedback from Websense development, via tech support, is that the proxy can't treat https sites using subject alternative names correctly. I've asked for a detailed explanation, and whether this can be fixed or is inherent with a transparent proxy. I believe a lot of sites use the subjectAltName X.509v3 certificate extension, and not handling that correctly could cause a lot of problems.
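To illustrate why subjectAltName handling matters, here is a simplified Python sketch of matching a hostname against the DNS entries of a certificate, again using the dictionary shape returned by `ssl.SSLSocket.getpeercert()`. It is not a full implementation of the matching rules in the relevant RFCs, the function names are chosen here, and the sample certificate is made up.

```python
def hostname_matches(pattern, hostname):
    """Match a hostname against a certificate name, allowing a leading '*.'."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        # A wildcard covers exactly one left-most label: *.example.com
        # matches a.example.com but not example.com or a.b.example.com.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def cert_covers(cert, hostname):
    """True if any DNS subjectAltName entry in the cert covers the hostname."""
    san = cert.get("subjectAltName", ())
    dns_names = [value for kind, value in san if kind == "DNS"]
    return any(hostname_matches(name, hostname) for name in dns_names)


# Made-up certificate: one commonName, two subjectAltName entries.
sample_cert = {
    "subject": ((("commonName", "www.example.com"),),),
    "subjectAltName": (("DNS", "www.example.com"),
                       ("DNS", "*.static.example.net")),
}

print(cert_covers(sample_cert, "img.static.example.net"))  # True
print(cert_covers(sample_cert, "example.net"))             # False
```

A validator that compared only against the subject commonName would reject img.static.example.net here even though the certificate legitimately covers it, which is the kind of failure that would break pages pulling content from several hostnames.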

In general, development (via tech support) seems to be claiming the unknown revocation state errors are a problem with the web sites, not the SSL proxy.

Disabling the certificate verification engine would solve a lot of problems, but we don't want to do that. It would be hard for us to explain to our large tech savvy user population that not only are we cracking open their SSL, but we're hiding MITM attacks from them.

Disabling the SSL proxy entirely is an option. It's having a high operational cost and user perception cost. But then we lose threat protection (ACE and DLP) entirely for https sites. Recent month statistics show 7,651 blocks with 66 of them on https URLs, so malicious activity over https is a small percentage, but certainly not negligible.
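For reference, the share of blocks that occurred on https URLs works out as follows, using only the two figures quoted above:

```python
total_blocks = 7_651   # blocks in the recent month (from the post)
https_blocks = 66      # of which were on https URLs

share = https_blocks / total_blocks
print(f"{share:.2%}")  # 0.86%
```

So a little under 1% of blocked requests would go unseen if https traffic were no longer decrypted.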

Disabling the SSL proxy for large blocks of categories may help, but we don't have much confidence it would help given the nature of the decryption issues.

I have to give credit to tech support and to our account manager who have been very helpful in beginning the process to get these issues resolved.

|
Not Ranked
Posts 3

What is SNI?

|
Top 50 Contributor
Posts 39

Another piece of information relevant to certificate verification. Development (via tech support) says the certificate verification engine (SCIP) checks a local sqlite3 database before it verifies a certificate. If an older "record" is found, it will use the older record. These must be removed from the incidents list by selecting Action > Remove. As I understand it, this means once there is an incident in the database for a certificate verification failure, future visits to the site will result in that failure being displayed to the user, even if the problem no longer exists. This would mean it is necessary to periodically (e.g. daily) delete all the "open" incidents in the incident list, which can take a lot of time as they must be removed one by one (and if you're getting tens or hundreds in a day, this is very time consuming). Of course the incidents can be turned into Allow rules, but this seems to create the opportunity for MITM attacks, and I think Allow rules make sense only in very rare cases and are definitely not a long-term solution for bugs in the verification engine.
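The behavior described above can be modeled in a few lines. This is only a toy model in Python with an in-memory sqlite3 database: the table layout, column names, and functions are invented for illustration and are not the real WCG/SCIP schema, which is not known here.

```python
import sqlite3

# Illustrative schema only; NOT the appliance's actual database layout.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE incidents (host TEXT PRIMARY KEY, verdict TEXT, seen_at INTEGER)"
)


def verify(host, live_check, now):
    """Return the cached verdict if one exists, else run the live check."""
    row = db.execute(
        "SELECT verdict FROM incidents WHERE host = ?", (host,)
    ).fetchone()
    if row:
        return row[0]                  # an older record wins over reality
    verdict = live_check(host)
    if verdict != "ok":
        # Only failures are recorded as incidents.
        db.execute("INSERT INTO incidents VALUES (?, ?, ?)", (host, verdict, now))
    return verdict


def purge_older_than(cutoff):
    """Bulk-remove incidents recorded before `cutoff`; returns rows deleted."""
    cur = db.execute("DELETE FROM incidents WHERE seen_at < ?", (cutoff,))
    return cur.rowcount


first = verify("site.example", lambda h: "unknown revocation state", now=100)
second = verify("site.example", lambda h: "ok", now=200)   # problem fixed upstream
removed = purge_older_than(150)
third = verify("site.example", lambda h: "ok", now=300)
print(first, "|", second, "|", removed, "|", third)
```

In this model the second visit still reports the old failure even though the live check would pass, and only a purge clears it. A single bulk delete by age is the kind of operation that would replace removing incidents one by one.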

|
Top 50 Contributor
Posts 39
SNI stands for Server Name Indication. If you search the forums for "peer suddenly disconnected found", you'll see another thread about it. Also, Wikipedia has an article about it. This is enabled by default in the WCG; it should probably be disabled by default. Either way, there should be a way in the UI to change it, as right now you need to get tech support to go into the appliance and disable it (if you use appliances). Why browsers don't seem to have issues with SNI while the WCG does is a mystery.
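For anyone who wants to see what SNI changes on the wire, a client can attempt the same handshake with and without it. A minimal Python sketch using the standard `ssl` module; the `handshake` helper is a name chosen here, and this tests a server (or whatever sits in the path) from a client, not the WCG's own setting.

```python
import socket
import ssl


def handshake(host, port=443, send_sni=True, timeout=5.0):
    """Attempt a TLS handshake and return the negotiated protocol version.

    With send_sni=False no server name is sent in the ClientHello, so
    hostname checking is turned off as well (Python requires a
    server_hostname whenever check_hostname is enabled). The certificate
    chain is still verified.
    """
    ctx = ssl.create_default_context()
    if not send_sni:
        ctx.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as raw:
        server_name = host if send_sni else None
        with ctx.wrap_socket(raw, server_hostname=server_name) as tls:
            return tls.version()
```

Calling `handshake(host, send_sni=True)` and `handshake(host, send_sni=False)` against a site that shows "Peer suddenly disconnected" would indicate whether the presence of the server name in the ClientHello makes a difference for that site.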
|
Not Ranked
Posts 3

Thanks for the clarification on SNI. These issues are troubling, and issues with SSL aren't the only thing we're running into. It is actually a relief (sad to say) that others are having issues with 7.6.

|
Top 50 Contributor
Posts 69

@kls: not with v7.6. Just with Websense in general. v7.5 afaik had the exact same issue, if not far worse. (in fact, it was quite a bit worse, at least from my own experience.)

In fact, I have a case open right now discussing this issue (to an extent) with gmail.com. Engineering suggested that we simply "tunnel" Google as a whole. The problem is that decryption errors are happening, as CScott mentioned above, on like half of the internets. (Yes, internets.)

I've actually raised this issue with the support manager and I believe the only reason he got brought into the mix was because my case was actually left opened for over 6 months.  Yes.  6 month old case.  I guess his supervisors occasionally look through the queue and pinpoint aging tickets.  

So my suggestion is: if you have a case that's open, leave it open. Do not cave, and do not let them close it. Get them to fix the engine. Send it back to backline engineering each time they come back with some trivial solution, like creating an SSL tunnel incident for https://*.* Okay, that part might be an exaggeration, but their solution is truly to just tunnel everything!

"How are you?  After further checking with engineering, the only other viable options are

 1. Delete any incident created/automatically added for any google domain.

2. Create an allow incident https://gmail.com to ignore validation error.

 *OR*

3. Create tunnel incident.

Are you receiving a lot of validation errors from a lot of different domains? Because, you should be able to enter in the incident list for gmail one time."

So far, I'm not impressed with the level of attention any of my cases have received from devel. I'm currently working off of case #813166 to address a validation error that appears with gmail.com. My original cases, which are now maybe a year old (case #785987), are closed pending a solution to 813166. It's all truly part of the same issue (a validation engine that doesn't work), but I guess if they can close super old cases, then management won't see the issue, and if management doesn't see the problem, the problem isn't really a problem, right?

|
Top 10 Contributor
Posts 986
Trusted Users (MVP)

I hate to suggest it-- but I really think turning off Certificate Validation is the only way to save yourself these headaches.  It sucks, but I can't tell you how much easier my life is as a Websense administrator with it off. 

The security benefits of doing the validation are questionable anyway; how many of your users actually read certificate warnings instead of just clicking through them? For me, most of my users started calling our Support number because they just saw the Websense logo and thought the site was blocked.

Seriously -- give it a try and see how it changes things for you.  The vast majority of these issues disappear and you'll come to realize what little value Validation adds (especially given the trouble it adds).  It's simply not worth it to implement.  It should be, but it isn't.

|
Top 10 Contributor
Posts 480

Since we're discussing certificates, does your Internal CA, the one that generates the MITM certificates, sign them using MD5 instead of SHA-1? We just threw up our first Websense eval and it uses 7.6.2 and Firefox's SSL Blacklist add-in complained immediately and they do show as MD5. That was brought up as an issue with SSL certificates in general over three years ago.

Is there a way to change the signing algorithm?

Thanks,

Ray

|
Not Ranked
Posts 4

Some users have secure browser plugins that automatically try to use https for every site. An example of where this bites us is www.igoogle.com, which is a cname to www.google.com.

Common name matching will create an incident for igoogle. It also creates a second incident for the www.google.com certificate which then denies all legit requests to www.google.com

I'm making a feature request to at least get a toggle for this. I can't manually check 8 WCGs every day to remove incidents caused by this.

|
Not Ranked
Posts 4

RE: turning off validation

This is a ridiculous suggestion, especially in light of the recent compromised Root CA authorities.

|