Security and compatibility behind new "STW Preview Verification code" JS embedding scheme

Update: The content in this post may be out of date or inaccurate.

So, I just read about the latest obnoxious service change. Starting from Nov 1
the simple <img> link will stop working, and it gets replaced by a Javascript
snippet that does basically the same (for now), right? Please answer some of
the overlooked questions with that.

First, document.write() cannot be used in true XHTML documents.
http://www.w3.org/MarkUp/2004/xhtml-faq#docwrite
So, ehm, how is this going to work exactly?

And then, if XHTML won't be supported anyway, why does the Javascript not just
use HTML/SGML syntax? It makes very, very little sense to use "<img .. />" with
a trailing slash. That's tag garbage in HTML. (Remember, HTML is the only thing
your Javascript snippet will actually run with.)

And thirdly, where are the privacy and security statements regarding that
change?
This Javascript snippet executes in my domain's security context and has access
to local cookies and whatnot. Nor is it guaranteed that it won't inject
tracking cookies of its own. (And yes, some countries have privacy laws that
regulate that.)
So why isn't it served as a versioned Javascript file from a CDN that ensures
that no context violations can occur?

--

Quite frankly, with the never-ending changes, downtimes and misconfigurations,
STW is becoming more work than it's worth. A simple wkhtmltoimage setup seems
to take way less maintenance and effort.

puravida
Jedi Warrior
Joined: 09/01/2007

Quote:
Starting from Nov 1, the simple <img> link will stop working, and it gets replaced by a Javascript snippet that does basically the same (for now), right? Please answer some of the overlooked questions with that.

I suppose that "stop working" is a relative term, since you joined after the Preview Verification Page had already become the only thing we provide to free users. It is the only sample code given by default and is all over the marketing pages for free users.

If you dug around the site to figure out how to use the "Embedded" or "Advanced API" methods, then that's great that you took the time to research; but it was -technically- exploiting a loophole that just hasn't been closed yet --because we have delayed for six(6) months while we attempted to communicate this change.

Quote:
First, document.write() cannot be used in true XHTML documents.
http://www.w3.org/MarkUp/2004/xhtml-faq#docwrite
So, ehm, how is this going to work exactly?

No idea. We haven't had the time (or any requests) to investigate how to support XHTML. Instead of being aggressive about the change, we (and possibly other users) would welcome your feedback and assistance. But just complaining doesn't do anyone any good.
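For readers who do serve true XHTML: a common way around the document.write() limitation is to build the element with DOM methods instead. The following is only a rough sketch under assumptions, not STW's actual snippet. The function name appendPreviewImage and the URL in the usage note are hypothetical, and the document object is passed in as a parameter so the function can be tried outside a browser.

```javascript
// Sketch: insert a preview image via DOM methods instead of document.write().
// appendPreviewImage is a hypothetical helper, not part of any STW script.
const XHTML_NS = "http://www.w3.org/1999/xhtml";

function appendPreviewImage(doc, container, src, alt) {
  // In XML-parsed documents the element needs the XHTML namespace;
  // fall back to createElement() where createElementNS() is missing.
  const img = typeof doc.createElementNS === "function"
    ? doc.createElementNS(XHTML_NS, "img")
    : doc.createElement("img");
  img.setAttribute("src", src);
  img.setAttribute("alt", alt);
  container.appendChild(img);
  return img;
}
```

In a browser this might be called as appendPreviewImage(document, document.getElementById("preview"), "http://example.com/thumb.jpeg", "Preview"), where the element id and the image URL are placeholders.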

Quote:
why does the Javascript not just use HTML/SGML syntax? It makes very, very little sense to use "<img .. />" with a trailing slash. That's tag garbage in HTML. (Remember, the only thing your Javascript snippet will actually run with).

Pretty much the same answer as above, but it sounds like you know enough to also know the answer to this question; though I'll answer for the benefit of others who may not be as knowledgeable.

We provide "sample" code for various types of implementations. It is confusing and unrealistic (especially for the typical free user) to have to choose their exact configuration and document type to get a snippet of code that works perfectly for them. Therefore, we provide something that works in most cases. For edge cases, like yours, the snippet would work, but we expect you to know how to make the slight adjustments, if it's important to you. I think, judging by the level of expertise you convey, that you know how to do that.

Quote:
And thirdly, where are the privacy and security statements regarding that
change? This Javascript snippet executes in my domain's security context and has access to local cookies and whatnot. Nor is it guaranteed that it won't inject
tracking cookies of its own. (And yes, some countries have privacy laws that
regulate that.)

This seems like an "edge case" request that most users would not read. If you ask nicely like most of our other users ;), we would gladly put up a page to highlight a privacy policy stating that we won't embed cookies or attempt to access cookies cross-domain, etc.

Quote:
So why isn't it served as versioned Javascript file from a CDN that ensures
that no context violations can occur?

I can think of ways that a CDN can be compromised, but if it makes you warm-and-fuzzy; I will let you know that you "could" access it from the CDN also. However, that is handled as an edge case, because we already rolled out the code using our primary domain. You can use either one, so we don't have to have all users update their code.

The code to use in the HEAD would be:
<script type="text/javascript" src="http://c249773.r73.cf1.rackcdn.com/pagepix.js"></script>

puravida

I forgot about the last part:

Quote:
Quite frankly, with the never-ending changes, downtimes and misconfigurations,
STW is becoming more work than it's worth. A simple wkhtmltoimage setup seems
to take way less maintenance and effort.

There are many similar services out there but each has its hiccups. I won't say anything negative about them, in case you want to test them out. :)

I will say, though, that if you want to implement a script or try to do it in house (and are not able or willing to use the PVP), you might want to consider that if it takes more than $4.95/month in bandwidth or computing costs, then you are fighting a losing battle. If it takes an hour of your time, then you've defeated the purpose instead of just automating it with a $4.95/month account upgrade. Just saying.

And there has been very little "downtime" --seeing as how we have an excellent uptime over the three years we've been around, despite a storage failure that kept us down for a whole day this year. And the popular competing services have seen weeks (or months in some cases) of downtime (assuming they didn't just go belly up), while we've enjoyed a 99.95% uptime until the recent outage, which dropped us to 99.88% uptime (still excellent by any definition).

horschaek
Joined: 05/22/2009

Hi all,

Brandon may remember, but for the rest of you: I'm a native German speaker, so sometimes I might not understand everything on the first go (or I drop letters when typing, and my sentence structure sometimes turns into the German one).
So I beg your pardon.
I swear my spoken English is much better, or so say my colleagues from Australia.

Hello Brandon,

Am I right that there is no way to turn off the "STW Preview Verification" page (PVP) by default?

I'm asking because of four points:

  1. The user clicks on the thumbnail and gets something completely different.
    Only on a second look may the user realize that this is another preview of the site he/she expected.
  2. It takes quite some time until the PVP site is loaded; much longer than the final link target (e.g. time from click to first byte: 2.707 sec [PVP] vs. 0.413 sec [direct]).
  3. And once the PVP site has loaded, it is only in English.
    An absolute no-go for non-English users who visit a non-English website and expect a non-English link target.
  4. The disable link at the bottom is much too small; make it a button like the "Continue ..." button.

I would appreciate it if it were possible to turn off the PVP by default.

If this is not possible in any way (it should be: implement a new "[OPTIONAL] preview" attribute) or not wanted by STW (why?), please look into the loading speed of the PVP site right away. And do offer it in different languages via an "[OPTIONAL] language" attribute.

For the German version of the PVP site, I am offering you the translation now (hopefully you will implement it):

Instant Previews provided by
Sofort-Vorschauen präsentiert von

You are viewing a preview of the following web page:
Sie sehen eine Vorschau der Webseite:

The purpose of this page is to provide enough details for you to decide if you want to visit the web page shown below.
Der Zweck dieser Seite ist es, Ihnen Informationen zu liefern, um zu entscheiden, ob Sie die unten angezeigte Seite besuchen möchten.

No title was found.
Es wurde kein Titel auf der Webseite gefunden.

No description was found.
Es wurde keine Beschreibung auf der Webseite gefunden.

No tags were found.
Es wurden keine Schlagworte auf der Webseite gefunden.

Continue on to the web page ...
Weiter zur Webseite ...

Click here to disable this preview page
Hier klicken, um diese Vorschauseite zukünftig zu überspringen

Greetings from Germany (Heidelberg)
horschaek

puravida

@horschaek
That's excellent! Thank you.

We will, hopefully, be adding language support before Jan 29th, 2012. Now, German will be the first language we add. ;)

Cheers,

Brandon

horschaek

@puravida

Hmmm, a few months ago I told myself I have to track something ... ;)

hairdressing
Joined: 09/13/2011

Thanks for the heads up about wkhtmltoimage

milki
Joined: 10/03/2010

Since I was apparently using an outdated API and STW is moving to more flashy features, I figured it's time to move on. And it turned out to be very little effort, and actually much snappier on my shared host. Just using thumbnails for a few personal projects, so didn't really need a distributed service, or spooler, or anything fancy in the first place.

I'm sharing the code here for anyone interested. It comes with no support whatsoever. Using it is fairly simple, but requires above-average experience with configuration/code. Further security precautions might be necessary for other setups.

Anyhow, I've deployed a transparent thumbnail generation feature. It sits on a separate virtual host, and expects URLs to be thumbnailed in double-urlencoded form, plus .jpeg suffix:

http://preview.example.com/http%253A%252F%252Fwww.google.com%252F.jpeg
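For anyone building such links from a page script, the double encoding can be sketched in JavaScript as below. This is only an illustration: previewUrl is a hypothetical helper, and preview.example.com is the placeholder host from the example URL above. Note also that encodeURIComponent() and PHP's urlencode() treat a few characters differently (a space becomes %20 in one and + in the other), so URLs containing such characters may not map onto the same file name the PHP script stores.

```javascript
// Sketch: build the double-urlencoded thumbnail URL for a page address.
// previewUrl is a hypothetical helper; the default host is a placeholder.
function previewUrl(pageUrl, base = "http://preview.example.com/", type = "jpeg") {
  // First pass yields the single-encoded form used as the file name on disk.
  const once = encodeURIComponent(pageUrl);
  // Second pass protects the percent signs for the request path.
  const twice = encodeURIComponent(once);
  return base + twice + "." + type;
}

console.log(previewUrl("http://www.google.com/"));
// prints "http://preview.example.com/http%253A%252F%252Fwww.google.com%252F.jpeg"
```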

If the image exists, Apache will deliver it straight away. Otherwise a generation script will engage. Here's the magic .htaccess to make that work:

#-- transparent preview.php image generation
#
RewriteEngine On
RewriteBase  /

# Probes if requested filename matches "http.....jpeg"
# If such a file does not exist, then redirect through the image generator.
#
RewriteCond  %{REQUEST_URI}    /((http.+)\.(png|jpeg))$
RewriteCond  %{DOCUMENT_ROOT}/%1     !-f
RewriteRule  ^.+$           ./.preview.php?url=%2&type=%3  [L]

# Fallback for empty/failed thumbnails
RewriteCond  %{REQUEST_URI}    /((http.+)\.(png|jpeg))$
RewriteCond  %{DOCUMENT_ROOT}/%1     -f
RewriteCond  %{DOCUMENT_ROOT}/%1     !-s
RewriteRule  ^.+$                    .placeholder.jpeg

#-- security: move preview.php out of image directory and disable exec
#
#Options -ExecCGI -MultiViews
#php_flag engine off
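
To make the flow of those rules easier to follow, here is a rough model of the three outcomes as a plain JavaScript function. It is not how Apache works internally, just a sketch: decideAction and the stat callback are hypothetical, and the request path is assumed to arrive already decoded once (that is, in the single-urlencoded form the files have on disk).

```javascript
// Sketch: models the outcomes of the rewrite rules as a plain function.
// decideAction and the stat callback are hypothetical names for illustration.
const PREVIEW_PATH = /\/((http.+)\.(png|jpeg))$/;

function decideAction(requestPath, stat) {
  const m = PREVIEW_PATH.exec(requestPath);
  if (!m) return { action: "passthrough" };        // not a thumbnail request
  const [, file, url, type] = m;
  const info = stat(file);                         // null if missing, else { size }
  if (info === null) {
    return { action: "generate", url, type };      // RewriteCond !-f
  }
  if (info.size === 0) {
    return { action: "placeholder" };              // RewriteCond -f and !-s
  }
  return { action: "serve", file };                // Apache delivers the file
}
```

A missing file goes to the generator, an empty file (left behind by a failed run) gets the placeholder image, and anything else is served directly.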

And this is the very simplistic thumbnail generation script (requires wkhtmltoimage):

<?php
/**
 * Generates website preview images.
 *
 * Needs transparent .htaccess RewriteRule and .wkhtmltoimage
 * in the current dir (path should either be writable or suphp
 * enabled).
 *
 * Images must be requested using double-urlencoded notation:
 * http://img.example.com/http%253A%252F%252Fwww.google.com%252F.jpeg
 * (They are stored single-urlencoded on disk, so that the RewriteCond
 * checks can associate them.)
 *
 * A cron job should regularly clean out older thumbnails.
 *
 */


#-- parameters
# (no /e modifier: it has no meaning for preg_match and was removed in PHP 7)
$restrict_referer = "#^$|^\s*http://([\w.]+\bEXAMPLE\.COM|www)/#i";
$type = "jpeg";
# The incoming URL was double-urlencoded, so needs to be cleaned up once,
# filter_var for minimum validation (does not guarantee http URLs)
$url = filter_var(urldecode($_GET["url"]), FILTER_VALIDATE_URL);


#-- restrain referer
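$referer = isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : "";
if (!preg_match($restrict_referer, $referer)) {
   # send the status first; header() returns nothing worth printing
   header("HTTP/1.0 403 Forbidden");
   die("Referer mismatch");
}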


#-- execute wkhtmltoimage
if (preg_match("{^https?://[\w\-.?&#=+%/~:,;()\[\]\pL]+$}u", $url)) {

    # escape and wrap cmdline arguments
    # the image filename will be just "URL"+".JPEG"
    $img_fn = urlencode("$url.$type");
    touch($img_fn);
    $img = escapeshellarg($img_fn);
    $url = escapeshellarg($url);

    # run WebKit binary to create image file in current directory
    exec ("nice ../../files/wkhtmltoimage --format '{$type}' --quality 80 ".
          " --zoom 0.32 --width 320 --height 240 ".
          " --disable-local-file-access --disable-javascript ".
          " {$url} {$img} ".            #--disable-plugins
          " 2>/dev/stdout");
    # --crop-h 250 --crop-w 400
}


#-- redirect to image,
#   when generation was successful
if (isset($img_fn) && file_exists($img_fn)) {
    $img = urlencode($img_fn);
    header("Location: ./$img");
    print "<a href=\"$img\">$img</a>";
}

?>

Notice that at least the referer needs updating. Current dir must be writable (suexec/suphp setup or chmod permissions). And again, only really suitable for minor usage. (I might do a spooler-version, doesn't seem much effort either.)

Oh, and as cron-job something like:

# Refresh every 21 days
#
find /usr/web123/html/preview/ -name "*.jpeg" -mtime +21 -delete

# Remove empty files after 3 days
#
find /usr/web123/html/preview/ -name "*.jpeg" -empty -mtime +2 -delete

puravida

That's actually very cool that you got that working, milki.

After a quick code review, it looks like that implementation may work for some people. It's quick and dirty and I'm not sure if it captures Flash (noticed that Google's screenshots do not capture Flash), but it should work. Kudos!

It's cool how far things have come, because back when I pioneered the technology, it took 3 months of arduous code and testing to get working on *nix. Now it's just a simple copy/paste and some configuration to get a basic implementation.

Topic locked


©2014 ShrinkTheWeb. All rights reserved. ShrinkTheWeb is a registered trademark of ShrinkTheWeb.