Category : Web

Getting WordPress Prettylinks working in IIS

Today I was restoring a backup of a WordPress site I have. I copied all the files into a folder, squirted all the data back into MySQL and set up a separate website in IIS. All was fine – I could browse to the main page (www.domain.com/index.php) and got the page I expected – but clicking on any of the links gave me a 404 error. A bit of checking uncovered that it was probably down to the pretty links that WordPress was configured to use.

It seems that IIS interprets http://www.domain.com/category/blog-post-title/ as a request to view the folder contents or the default page in that particular folder ( /category/blog-post-title ), when in actual fact WordPress is expecting that URL to be passed to index.php so that it can be parsed and the correct content rendered.

Anyway, this can be solved via a simple IIS Rewrite Rule. To get this working add the following to the web.config file that is in the same folder as all the WordPress files :-

<rewrite>
    <rules>
        <rule name="Main Rule" stopProcessing="true">
            <match url=".*" />
            <conditions logicalGrouping="MatchAll">
                <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
            </conditions>
            <action type="Rewrite" url="index.php" />
        </rule>
    </rules>
</rewrite>

 

This needs to go into the <system.webServer> section of web.config.
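
For reference, here's roughly what the complete web.config looks like with the rule in place (a minimal sketch – if your web.config already has other settings, just add the <rewrite> block inside the existing <system.webServer> element). Note that the IIS URL Rewrite module needs to be installed for rewrite rules to work at all :-

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <rule name="Main Rule" stopProcessing="true">
                    <match url=".*" />
                    <conditions logicalGrouping="MatchAll">
                        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                        <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
                    </conditions>
                    <action type="Rewrite" url="index.php" />
                </rule>
            </rules>
        </rewrite>
    </system.webServer>
</configuration>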

 

New Gmail Design

Okay, so I like the new Gmail design – cleaner, with more whitespace – it just looks much more modern than the existing one.

But it’s still bloody annoying !

The thing is, I hate having to go across to the little ‘back arrow’ reply icon, click the dropdown next to it, just to get to the ‘actions’ I can carry out on an email.

‘Forwarding’ is two clicks when I simply want a single click for it. I’m surprised no one has come up with a greasemonkey script that does this – maybe I need to stop whinging and write myself one.

Microsoft do a good job of this in the old Outlook Web Access (2003 and 2007) and in the newer Outlook Web App (2010).


What do you think ?

SportyPal Screen Scraping with PowerShell

For the past year (2010) I had been using SportyPal – an application for tracking exercise (runs mainly for me). It has mobile apps (iPhone, Android, WinMob etc.) that do the actual tracking and then upload the data to their website where you can view history, graphs, charts, records etc. It really is neat, and the user interface on the mobile device really looks good (at least the WinMob and Android versions I used)…

I had been pinging them on their forums for a while about when their subscription service would launch and about getting a trial. They announced that they expected it in Summer 2010, but it got delayed and delayed, and although I got vague answers about release dates from their forum and Twitter responses, it still was not available by the end of December 2010.

So, much as I loved their app, it was time to switch – RunKeeper was the new app/service I chose. The problem I faced was how to get my year's worth of data out of SportyPal – they do allow you to export GPX data for each run, but only on a run-by-run, manual basis – not good.

Time to crack open PowerShell…

A bit of poking around with Chrome Developer Tools and Fiddler2 identified the sequence for logging in and the format of the history/activity page that lists all runs.
So now I had a basis for screen-scraping the data I needed, and I also noticed that each run had a link to download its GPX.

I put together a script that logs in, opens the activities page and grabs the data about each run; it also downloads a copy of the GPX data to a separate file for each run, with the filename set to the date/time stamp of the run.
I couldn't see any easy method of bulk importing into RunKeeper (to be honest I didn't look for very long), but as I only had 100 runs to import I simply did it manually (RunKeeper's import function is only about 3 clicks).

Initially, every GPX file I imported came up with an error about the GPX being invalid. However, after a browse around the forums I found that the namespace in the exported files is incorrect (http://www.topografix.com/GPX/1/0 when it should be http://www.topografix.com/GPX/1/1). Correcting the namespace did the trick, and on importing each GPX file the correct run details, route etc. all showed up – so all was good.

Here’s the PowerShell script – it logs in to your account, screen scrapes all your activities and then downloads the GPX for each. It also ‘fixes’ the downloaded GPX so it has the correct namespace :

$email = "your_email"
$password = "your_password"


$url = "http://www.sportypal.com/Login"

"Starting..."
""

     [System.Net.HttpWebRequest] $request = [System.Net.HttpWebRequest] [System.Net.WebRequest]::Create($url)
     $cookieJar = new-object "System.Net.CookieContainer"
     $request.CookieContainer = $cookieJar
     $request.Method = "POST"
     $request.ContentType = "application/x-www-form-urlencoded"
     $param = "Email=" + $email + "&Password=" + $password + "&login=Login"

     $request.ContentLength = $param.Length

     # write the login form data to the request body
     [System.IO.StreamWriter] $stOut = new-object System.IO.StreamWriter($request.GetRequestStream(), [System.Text.Encoding]::ASCII)
     $stOut.Write($param)
     $stOut.Close()

     "Logging in..."
     ""

     [System.Net.HttpWebResponse] $response = [System.Net.HttpWebResponse] $request.GetResponse()

     if ($response.StatusCode -ne 200)
     {
           $result = "Error : " + $response.StatusCode + " : " + $response.StatusDescription
     }
     else
     {
           $sr = New-Object System.IO.StreamReader($response.GetResponseStream())
           $txt = $sr.ReadToEnd()
           # isolate the workouts table from the returned HTML
           $cutstart = $txt.Substring($txt.IndexOf('<table id="my_workouts"'))
           $cutend = $cutstart.Substring(0,$cutstart.IndexOf("</div>"))

           "Getting workouts"
           $workouts = @()
           $ipos = 0
           while(($ipos -ne -1) -and ($ipos -lt ($cutend.Length -1)))
           {
                $s = $cutend.IndexOf("<tr id=", $ipos)
                if ($s -ne -1)
                {
                    $e = $cutend.IndexOf("</tr>", $s)
                }
                else
                {
                    $e = -1
                }
                if(($e -ne -1) -and ($s -ne -1))
                {
                    $tr = $cutend.Substring($s, ($e + 5) - $s)
                    $workouts += $tr
                }
                $ipos = $e
           }

           #$workouts | %{ $id = $_.Substring(11,6); $id }

           "Got " + $workouts.Length + " workouts"
           foreach($wo in $workouts)
           {
                $id = $wo.Substring(11,6)
                $s = $wo.IndexOf("dateval_$id") + 23
                $dt = (New-Object "System.dateTime"(1970,1,1)).AddMilliseconds($wo.Substring($s,13))
                $s = $wo.IndexOf("td_number clickDetails") + 24
                $e = $wo.IndexOf("</td>", $s)
                $dist = $wo.Substring($s, $e-$s).Trim()
                $dist = $dist.Substring(0, $dist.Length-2)
                $s = $wo.IndexOf("td_number clickDetails", $e) + 24
                $e = $wo.IndexOf("</td>", $s)
                $time = $wo.Substring($s, $e-$s).Trim()
                $s = $wo.IndexOf("td_number clickDetails", $e) + 24
                $e = $wo.IndexOf("</td>", $s)
                $cals = $wo.Substring($s, $e-$s).Trim()
                $cals = $cals.Substring(0, $cals.Length-4)
                "Workout on $dt ( ID = $id ) : $dist : $time : $cals calories"

                # now grab the GPX
                 $filename = "c:\scripts\sportypal\" + $dt.ToString("yyyy-MM-dd_HHmm") + ".gpx"
                $gpxUrl = "http://www.sportypal.com/Workouts/ExportGPX?workout_id=$id"
                [System.Net.HttpWebRequest] $gpxRequest = [System.Net.HttpWebRequest] [System.Net.WebRequest]::Create($gpxUrl)
                $gpxRequest.CookieContainer = $request.CookieContainer
                $gpxRequest.AllowWriteStreamBuffering = $false
                $gpxResponse = [System.Net.HttpWebResponse]$gpxRequest.GetResponse()
                [System.IO.Stream]$st = $gpxResponse.GetResponseStream()

                # write to disk
                $mode = [System.IO.FileMode]::Create
                $fs = New-Object System.IO.FileStream $filename, $mode
                $read = New-Object byte[] 256
                [int] $count = $st.Read($read, 0, $read.Length)
                while ($count -gt 0)
                {
                    $fs.Write($read, 0, $count)
                    $count = $st.Read($read, 0, $read.Length)
                }
                $fs.Close()
                $st.Close()
                $gpxResponse.Close()
                "- GPX Data in $filename"
           }

     }

     $response.Close()



    # now fix the namespace error
    $files = gci "c:\scripts\sportypal\*.*"
    foreach ($file in $files)
    {
        "Fixing namespace error in " + $file
        $content = get-content $file
        $content[0] = $content[0].Replace('xmlns="http://www.topografix.com/GPX/1/0"', 'xmlns="http://www.topografix.com/GPX/1/1"')
        Set-Content $file $content
    }

    "Complete"

Feel free to use this (at your own risk – I accept no liability whatsoever), but realise that you may be breaking all manner of T’s & C’s for SportyPal.

… and on a final note to anyone from SportyPal :

Sorry. I persevered, I really did. I was ready to throw money at you for a PRO subscription, and I could even have lived with a further slipped release date, but not keeping your fans updated and giving no idea of a roadmap – that doesn't make us feel the love.


UPDATE: Thanks to Ricardo, who identified a bug where workouts marked as private would not download correctly. The script is now updated with this fix.

Twitter oAuth from C#

A while back I was working on a ‘post to Twitter’ function that used the original Basic Authentication that the Twitter V1.0 API allowed.

Unfortunately, at the end of Aug 2010 they discontinued support for this and forced everyone to use their oAuth authentication. There are a number of services around (such as SuperTweet) that will proxy your Basic Auth to Twitter's oAuth, but really your best bet is to upgrade your code to support oAuth.

It is actually not as difficult as it may sound – here’s how I upgraded my code :-

First go to http://dev.twitter.com/apps/new and register your application. When complete you'll get a settings page with the details you need – most importantly the Consumer key and Consumer secret.


You do need to make sure that you have enabled both Read and Write access if you plan to post updates from your application.

Next, grab this zip file with the oAuth.cs and oAuthTwitter.cs classes that Eran Sandler wrote and Shannon Whitley extended, and add those to your project (make sure you change the namespace !)

Now in your class you need to create an oAuth object and assign the ConsumerKey and ConsumerSecret properties to the values given to you in the Twitter API settings page. Then, when someone wants to authorize your app to work with their account, you need to get the authorization link and fire up a browser navigating to that link :

oAuthTwitter _oAuth = new oAuthTwitter();

_oAuth.ConsumerKey = "AAAAAAAAVGJTAZerhSFsafvg";
_oAuth.ConsumerSecret = "AAAAAAAgSgLfwFZDSQ3AZNDA5LwMfPnmPJud7kbCzo";

string authLink = _oAuth.AuthorizationLinkGet();
System.Diagnostics.Process.Start(authLink);

The result of them clicking ‘Allow’ in the web browser is a PIN. You need to use that PIN in a call to AccessTokenGet which finally populates the Token and TokenSecret properties of the oAuth object.

_oAuth.AccessTokenGet(_oAuth.OAuthToken, thePin);
string Token = _oAuth.Token;
string TokenSecret = _oAuth.TokenSecret;

Now (very important) you need to save the Token and TokenSecret for later use (you don’t want your user having to authorize for every update).
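
How you persist them is up to you. As a minimal sketch (assuming a simple single-user desktop app – anything real should use protected or encrypted storage), you could just write them to a local file :

// minimal sketch only - the filename is just an example, and you should really protect these values
System.IO.File.WriteAllLines("twitter.tokens", new string[] { Token, TokenSecret });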
To send an update is now a simple affair – create an oAuthTwitter object, set the properties and then use the oAuthWebRequest function:

oAuthTwitter _oAuth = new oAuthTwitter();

_oAuth.ConsumerKey = "12345GJITAZerhSFsafvg";
_oAuth.ConsumerSecret = "lkjghKJGHblgLfwFZDSQ3AZNDA5RLwMfPnmPJud7kbCzo";
_oAuth.PIN = thePin;
_oAuth.Token = theToken;
_oAuth.TokenSecret = theTokenSecret;

string url = "http://twitter.com/statuses/update.xml";
string tweet = "status=" + System.Web.HttpUtility.UrlEncode("Tweet from my application!");
string xml = _oAuth.oAuthWebRequest(oAuthTwitter.Method.POST, url, tweet);

We’re done ! Less than 20 lines of code.


Generate sitemaps using PowerShell

I was discussing the 'googlability' of our knowledgebase – a new word I made up meaning 'the ability to be found via Google' – with one of the technical guys at work.
It seems that we seldom get matches in Google searches (and the built-in search is somewhat lame) – I was quite surprised that Google wasn't matching anything.

Looking into it a bit further, I found that although our knowledgebase is public, the Urls are pretty undiscoverable, all having an 'articleid' parameter – obviously, the GoogleBot couldn't just guess at the values and so was skipping the majority of our articles, apart from the few listed on the main page.

We needed to give it some hints by adding a sitemap. I (ever so) briefly toyed with adding a sitemap page to the knowledgebase website using the standard XML-based sitemap protocol, but our site is written in PHP and I didn't want to get bogged down in all that again…
In a rare burst of being pragmatic and keeping things simple (as opposed to _way_ over-engineering a solution) I recalled that Google's webmaster tools allow you to submit a text file as a sitemap, with one Url per line.

I knew the format of the Url for our articles, so it just required a bit of PowerShell to generate a bunch of lines containing Urls with sequential numbers and write them to a file. Version 1 looked like this :

set-content "c:sitemap.txt" (1..1000 | %{ "http://support.c2c.com/index.php?_m=knowledgebase&_a=viewarticle&kbarticleid=$_&nav=0`n" })

However, uploading this sitemap caused the Google machine to choke and spew out a bunch of errors about invalid Urls… A little more digging uncovered that the text file uploaded must be encoded in UTF8. So version 2 looked like this :

set-content "c:sitemap.txt" (1..1000 | %{ "http://support.c2c.com/index.php?_m=knowledgebase&_a=viewarticle&kbarticleid=$_&nav=0`n" }) -encoding UTF8

Out popped a text file with 1000 Urls, in the correct format, with the correct encoding and accepted by the Google machine with no problems.
Probably 10 minutes' work all in – I wouldn't even have got the PHP coding tools fired up in that time – reminder to self : "KISS works !!"


Access ODBC Connection Strings

I was working on an old (classic) ASP page the other day. It was pulling data from an Access database file and using an ODBC driver to get the connection.

It was working fine on a Windows 2003 server, but when I pulled the file into a local website on my Windows 7 machine (with Office 2010 beta) it kept failing at the ODBC layer. The reported error message was :

Microsoft OLE DB Provider for ODBC Drivers error ‘80004005’
[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified

Looks like the driver specified in my connection string couldn’t be found. I was using the following :

    objConn.Open "DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=c:inetpubwwwrootpstdiscovery.mdb;"

This all looked correct, and the excellent “ConnectionStrings.com” website was saying the same thing – strange. It then struck me that I'm using Win 7 and Office 2010, either of which could have changed the ODBC driver or installed a new one. Checking in the “Data Sources (ODBC)” tool, I could see that the installed driver is named “Microsoft Access Driver (*.mdb, *.accdb)” – i.e. it also handles .accdb files – so I'm guessing this is an updated driver.
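
(If you'd rather not click through the GUI, a quick bit of PowerShell will list the installed ODBC driver names straight from the registry – just a convenience sketch, showing the 64-bit view :)

# list the ODBC driver names registered on this machine
# (32-bit drivers are under HKLM:\SOFTWARE\Wow6432Node\ODBC\ODBCINST.INI\ODBC Drivers)
(Get-Item 'HKLM:\SOFTWARE\ODBC\ODBCINST.INI\ODBC Drivers').GetValueNames()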

Changing the connection string (adding the *.accdb) was the next step.

objConn.Open "DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; DBQ=c:inetpubwwwrootpstdiscovery.mdb;"

Testing with this new connection string worked fine – problem solved….



 

WebApp KeepAlive Service

Recently I have been working with DotNetNuke. This is a superb open source CMS platform running on ASP.NET with a SQL back end, simple to install, easy to use and there is a thriving community around it. It is also available in a 'Professional' version which costs around £2000 per year and provides additional workflow features, a support contract and various other benefits.

I chose not to go for the pro version, just because I don't see much point in having some aspects of a website/webapp covered by a support agreement but not others (any third-party controls/extensions are not covered, and I had bought the excellent DataSprings Suite and was using multiple components from it).

Anyway, when I had everything working the way I wanted, I started to look at performance. There are many websites providing tools and tips around maximizing the performance of DNN sites – this includes caching strategies, output compression, turning off non-essential 'housekeeping tasks' and the like.
One of the tweaks (for low-traffic sites) was to 'kick' the site every few minutes to ensure that IIS does not unload it (and therefore need to spin it up again the next time someone visits – this can take a good few seconds).

My initial thoughts were around a scheduled PowerShell command – it's simple to come up with a one-liner to request a page and thus keep the web app in memory.

(new-object "System.Net.WebClient").DownloadString("http://www.website.com/")

Then I thought I might have it ‘kick’ a few websites, and make it configurable, so I started thinking ‘Windows Service’. It turns out that there are a load of these apps and services available to buy, some targeted specifically at DNN, some more generic. Reluctant to spend $20/year on a service I decided to craft my own.

It is basically a Windows service that reads an XML file for a timeout and a list of URLs to kick. It implements a 1-second timer and counts down the timeout value; when it reaches 0 it kicks each of the URLs (recording how long the response took). I haven't done anything with the measured response time, but it would be fairly easy to write it out to a DB or file for later analysis…
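
(If you don't fancy a full service, the same idea works as a scheduled PowerShell script. Here's a rough sketch along those lines – the file path is just an example – which also times each request; schedule it every few minutes with Task Scheduler and IIS should keep the apps warm :)

# urls to keep alive - one per line in a plain text file (example path)
$urls = Get-Content "c:\scripts\keepalive-urls.txt"
$wc = New-Object "System.Net.WebClient"
foreach ($url in $urls)
{
    $elapsed = Measure-Command { $wc.DownloadString($url) | Out-Null }
    "{0} : {1:N0} ms" -f $url, $elapsed.TotalMilliseconds
}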

Here is a .zip file with everything you need – binaries, a sample configuration file, install instructions and full source. It is Creative Commons licensed, so knock yourself out.

WebAppKeepAlive.zip


Google Results Ranking

Disclaimer: Screen-scraping results like this probably contravenes Google's Terms of Use (or something) and I do not advocate that you do it – this is purely hypothetical; if I did want to do it, this is how I would go about it 😉

Further Disclaimer: The results page format could change at any time and may well break this script; if that happens you are on your own (Firebug and some modified regex should help you out).


So, if you wanted to get the Google ranking of a bunch of domains when searching for a particular term, you could use one of the many SEO page ranking test sites that are available, but these are a pain inasmuch as they require you to enter the search term and the domain name you are looking for, and they give you the ranking (what position in the results the domain name comes). That is fine for individual searches (like what position kapie.com is if I search on 'Ken Hughes'), but not very good for doing a comparison of multiple domains against the same search term.

I looked at using Google's Search API to get this info, but unfortunately it only returns 4 or 8 results (it is mainly designed to present some brief results in a box on your website); what I needed was to look at a lot more results (up to 500 or so)…

Back to my trusty friend – PowerShell…

I create a web client, have it download the first X results for the search term, load each link Url and its position into a hashtable, and then look up the hashtable to find the rank position of each of the domain names I am looking for.
It was actually pretty easy; the only difficult part was getting the regexes correct – regex is evil, as evil as Perl…

Here is the script code :

  $domainNames = "google.com", "live.com", "bing.com", "yahoo.com"
  $maxResults = 100
  $searchTerm = "search"

  $urlPattern = "<\s*a\s*[^>]*?href\s*=\s*[`"']*([^`"'>]+)[^>]*?>"
  $hitPattern = "<\s*(h3)\sclass=r>(.*?)</\1>"

  $wc = new-object "System.Net.WebClient"
  $urlRegex = New-Object System.Text.RegularExpressions.Regex $urlPattern
  $hitRegex = New-Object System.Text.RegularExpressions.Regex $hitPattern
  $urls = @{}

  $resultsIndex = 0
  $count = 1
  while($resultsIndex -lt $maxResults)
  {
    $inputText = $wc.DownloadString("http://www.google.com/search?q=$searchTerm&start=$resultsIndex")

    "Parsing : " + $resultsIndex

    $index = 0
    while($index -lt $inputText.Length)
    {
      $match = $hitRegex.Match($inputText, $index)
      if($match.Success -and $match.Length -gt 0)
      {
        $urlMatch = $urlRegex.Match($match.Value.ToString())
        if(($urlMatch.Success) -and ($urlMatch.Length -gt 0))
        {
          $newKey = $urlMatch.Groups[1].Value.ToString()
          if(!$urls.ContainsKey($newKey))
          {
            $urls.Add($newkey, $count)
          }
          $count++
        }
        $index = $match.Index + $match.Length
      }
      else
      {
        $index = $inputText.Length
      }
    }
    $resultsIndex += 10
  }


  foreach($domain in $domainNames)
  {
    $maxPos = -1
    foreach($key in $urls.Keys)
    {
      if($key.Contains($domain))
      {
        $pos = [int] $urls[$key]
        if(($pos -lt $maxPos) -or ($maxPos -eq -1))
        {
          $maxPos = $pos
        }
      }
    }
    if($maxPos -eq -1)
    {
      $domain + " : Not Found"
    }
    else
    {
      $domain + " : Found at result #" + $maxPos
    }
  }

Drop me a line in the comments if you find it useful…


Remote Control via Twitter

Twitter is one of those applications/services that I've had trouble getting to grips with. For me it seems it's like shouting about what you are doing right now to a huge audience that is not listening. Who really cares that @kjhughes is heading to the shops to get some mint sauce ?? Maybe I'm just not that kind of social animal, maybe I don't have enough friends using it, maybe I should 'butt in' to other people's conversations more, maybe I'm just plain boring…

I do see a use for it though (for me). It's a pretty neat way to do some remote control stuff – like setting up a Media Center recording or rebooting my PC – and it's also a neat way of getting updates: new blog comment received, TV recording completed and the like…

So, right now I need to get my Outlook add-in project completed, but right afterwards I'm planning an app that interfaces to Twitter and accepts direct tweets as remote control instructions, and can also update me on specific events. I am also thinking about adding Twitter alerts to dasBlog (on comments, posts, errors, daily reports etc.).

I can see myself getting immersed in this twitter thing…


Online retailers – listen up

I buy a lot of stuff from online retailers and it never fails to amaze me how some aspect of the experience is always poor – seldom do I get a good experience end to end (purchase to delivery).

So, this is Ken's 4-step plan for online retailers to raise themselves from 'one of the crowd' to 'the leader'.

Step 1 – Get Credibility
The term 'ID fraud' is on everyone's lips right now. You have to give people the confidence to buy from you – confidence that you are not a fly-by-night outfit that is going to take their money and run.
How do you get credibility ? Get some testimonials, offer PayPal as a payment option, offer Google Checkout as an option, at least show me a valid SSL certificate and give some words around how safe it is to do business with you – most people don't (really) care about the technical aspects of payment security, they just want to feel safe – Google feels safe, PayPal feels safe, a statement telling me I'm safe feels safe.

Step 2 – Have Stuff in Stock (or Tell Me If You Don’t)
This is important: there are a thousand other people out there selling that very same item. If you don't have it in stock then they'll just move on to the next one. Hand in hand with this: be up-front about it if you don't have it in stock. If you imply you have it in stock, and someone buys and then has to wait 235 days for delivery, they will most likely cancel the order (in which case you just cost yourself admin time for nothing) and you can also be sure they will not be using your services again – better to be honest and hope they come back than to try to force the sale and know they'll never come back.

Step 3 – Don't Rip Me Off With Shipping / Delivery Charges
Ship at (or close to) cost. People do not appreciate being ripped off (and that is often all it can be called) with excessive shipping, handling and delivery charges. If you want to charge me £8 for shipping and handling on a book that you send me in a small padded envelope, I'm going to get annoyed and look elsewhere. Everyone (and by that I mean everyone) has cottoned on to the retailers' great money-spinner of making 'shipping' a profit centre.

Step 4 – Ship It Fast
This is my personal bugbear – when I choose to buy something, I have persuaded myself I have a 'need' (or more likely a 'want') for it. I took the plunge and ordered it from you, and I'm excited about my new purchase. I do not want to wait 5 – 8 business days for delivery. Actually I see no need for this ever to happen (unless you are bound by the delivery service) – as a retailer / warehouse manager you are either keeping up with orders or you are getting behind. If it's the latter then you need more staff, because things will just get worse (if you continue to sell stuff, which you probably hope you do); if it's the former then just get the 3-day continuous backlog cleared and you can then keep up with a continuous 0-day backlog, and your customers are all happy.

Do these and you’ll get my custom…
