Ed Snowden’s email service shuts down, owner advises against trusting data to US companies: what are the options?

It has been a while since our last post and a lot has happened in that time, including the explosion of the Edward Snowden PRISM surveillance revelations. These have continued to gather momentum, culminating in the closure of Lavabit, the email service that Snowden used. The owner, Ladar Levison, said that he had to walk away to prevent becoming complicit in crimes against the American public. All very cryptic and chilling. He also advised that he “would strongly recommend against anyone trusting their private data to a company with physical ties to the United States.” So what should you do if you have data stored on remote servers?

Well, firstly, you may not care. The data you are storing may not be sensitive at all, and that is the key point, i.e. you need a strategy for how you deal with sensitive data and the sharing of it. So what can you do?

1. You could consider encrypting the data that is stored on cloud servers. There are various ways to do this. Client-side tools such as BoxCryptor do a good job of it, and there are also more enterprise-grade platform solutions such as CipherCloud and Storage Made Easy that enable private-key encryption of data stored remotely. Both of the latter can be deployed on-premise behind the corporate firewall.
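As a minimal sketch of the client-side approach, you can encrypt a file with a locally held key before it is ever uploaded, so the provider only ever sees ciphertext. The filenames and key path below are hypothetical, and this assumes a reasonably recent OpenSSL (1.1.1+ for the -pbkdf2 option):

```shell
# generate a 256-bit key that never leaves your machine
openssl rand -out local.key 32

# encrypt before uploading; only secret.txt.enc goes to the cloud provider
echo "sensitive contents" > secret.txt
openssl enc -aes-256-cbc -pbkdf2 -salt \
    -in secret.txt -out secret.txt.enc -pass file:./local.key

# decrypt after downloading it back
openssl enc -d -aes-256-cbc -pbkdf2 \
    -in secret.txt.enc -out decrypted.txt -pass file:./local.key
```

Tools such as BoxCryptor automate essentially this flow per file and folder; the point is simply that the key material stays on your side of the wire.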

2. You could consider a different policy entirely for sharing sensitive data. On a personal basis you could use OwnCloud, or even set up a Raspberry Pi as your own personal DropBox, or again you could use Storage Made Easy to create your own business cloud, keeping sensitive data behind the firewall and encrypting any data stored outside it.

The bottom line: think about your data security, have a policy, and think about how you protect sensitive data.


Understanding DNS, propagation and migration

We recently had a customer migrating from one DNS provider to another due to large outages from their existing supplier, i.e. a failure to keep their DNS services working correctly. They went ahead and migrated by changing the A and MX records for their domains and sub-domains, and only contacted us when they started seeing outages during propagation. They suspected they had done something wrong but were not sure how to check.

The best way to check this is the dig command. dig is an acronym for Domain Information Groper. Passing a domain name to dig displays, by default, the A record of the queried site (its IP address).

We can use dig to check that the new nameservers are correctly returning the A and MX records. To do this:

 dig @<nameserver hostname or IP> <domain to check>

If this returns the expected values then the new nameservers hold the correct records, which means that once the delegation is changed at the registrar we can assume the answers will be correct.

In the case of the company in question, the new nameservers were correctly returning the NS and MX records for the domain, but their local recursor was still returning the old NS records because propagation had not yet taken place.

Other recursors can be checked to identify whether propagation has taken place there, e.g.:

dig @<recursor IP> <domain> would check that specific recursor (in the customer’s case, Verizon’s)

Others of note are (OpenDNS) and (Google Public DNS).

Others can be found on the OpenNIC wiki.

So in this company’s case, caching of the prior nameservers plus the TTL (time to live) was causing the problem, as the new nameservers had not finished propagating. Essentially there were two different sets of “nameservers”, each returning different values, and each being selected effectively at random (due to cached NS records).

One of the things we were able to do to help smooth the transition was to ensure each nameserver returned identical values by making both zones 100% identical, i.e. on the original service we changed the NS records to match those of the new nameservers. Ideally this would have been done as soon as the migration occurred.

Finding disk bottlenecks on Linux

We recently had a client whose site was slowing down. They initially thought it was due to MySQL locks, but this was not the case. It was clear that the problem was with the disk: some process was saturating it. Running top, we could see that the CPU I/O wait time was 25-30%.

Running vmstat also showed that the wait time was quite high, so the question was which process was causing the issue. Interestingly, a Google search brings up almost no coherent posts on finding disk bottlenecks. The first tool to reach for is good old iostat, which provides information about disk reads and writes per partition, but it does not tell you which process is causing the disk I/O. Later Linux kernels expose quite good per-process disk I/O diagnostics, but this is not documented in the reasonably popular older posts on the subject of disk thrashing.

For recent kernel versions you can use iotop to pinpoint the process that is actually doing the disk I/O. To do this:

1. Start iotop

2. Press the left arrow twice so that the sort column is DISK WRITE.

3. You will now see, in real time, specifically which processes are writing to the disk.

4. If you wish to see accumulated I/O since iotop started rather than the current rate, press ‘a’ (press ‘a’ again to switch back).
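If iotop is not installed, the per-process counters it reads are exposed directly in /proc on kernels with I/O accounting enabled, so a rough equivalent of its accumulated view can be sketched in shell (the top-5 cut-off is arbitrary, and reading other users’ processes requires root):

```shell
# rank processes by cumulative bytes written to disk (write_bytes)
for piddir in /proc/[0-9]*; do
    # each readable /proc/<pid>/io contains counters such as write_bytes
    awk -v pid="${piddir#/proc/}" \
        '/^write_bytes/ {print $2, pid}' "$piddir/io" 2>/dev/null
done | sort -rn | head -5
```

Pair the resulting pid with ps -p <pid> -o comm= to put a name to the culprit.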

In this client’s case, the issue was that their temp directory was on the same physical drive as their site and MySQL DB. Moving the temp directory to a separate drive resolved the issue.


Some MongoDB and MySQL comparisons for a real world site

We recently did some tests with regards to replacing an existing MySQL implementation with MongoDB. I thought some of the tests would be interesting to share.

MySQL ver 14.12 Distrib 5.0.27, for Win32 (ia32)

MongoDB v2.0.4 for Win32 (journaling not enabled)

The test was centred around a table that has 400000 records with numbered names

The table was indexed on two fields, id and an_id

Selection from a specific folder by name (no index on 'name'):

MySQL:   SELECT id FROM table WHERE (an_id=2 AND name='some name_251504');   [0.83 s]
MongoDB: db.files.find({an_id:1, name:'some name_255500'}, {id:1});   [0.44 s]


We then increased the record count to 800,000 (which reached the data file size limit on a 32-bit OS) and added an index on 'name'.

Data file sizes:

MySQL:   238 MB
MongoDB: 1.4 GB


Selection of files from a specific folder by name pattern (202,225 records found; the match pattern was changed slightly between runs to prevent cache usage):

MySQL:   SELECT count(*) FROM table WHERE (an_id=1 AND name like '%ame_2%');   [9.69 s first run, 3.62 s on subsequent runs]
MongoDB: db.files.find({an_id:0, fi_name:/ame_2/}, {id:1, fi_name:1}).count();   [0.69 s first run, 1.34 s on subsequent runs]


Count records in an id range (50,000 records found):

MySQL:   select count(*) from table where (id > 500000 and id < 550000);   [0.02 s]
MongoDB: db.files.find({id:{$gt:500000, $lt:550000}}).count();   [0.08 s]

Delete 10 records:

MySQL:   delete from table where (id > 800000 and id < 800010);   [0.13 s]
MongoDB: db.files.remove({id:{$gt:800000, $lt:800010}});   [0.00 s]


Delete 50,000 records:

MySQL:   delete from table where (id > 600000 and id < 650000);   [5.72 s]
MongoDB: db.files.remove({id:{$gt:600000, $lt:650000}});   [2.00 s]


Update 10 records:

MySQL:   UPDATE table SET name='some new name' WHERE (an_id=2 AND id > 200000 AND id <= 200010);   [0.08 s]
MongoDB: db.files.update({an_id:1, id:{$gt:200000, $lte:200010}}, {$set:{name:'some new name'}}, false, true);   [0.02 s]


Update 50 000 records:

UPDATE table SET name=’sone new name 2′ WHERE (id > 250000 AND id <= 300000);

db.files.update({id:{$gt:250000, $lte:300000}}, {$set:{name:’some new name2′}}, false, true);


10.63 s


3.54 s

Insert 50 records:

MySQL:   0.08 s
MongoDB: 0.02 s


Insert 500 records:

MySQL:   0.13 s
MongoDB: 0.09 s

Conclusions and other thoughts:

MongoDB has a clear advantage on speed, and this advantage increases as more records are added.

Concerns are:

– MongoDB is not as battle tested or hardened

– The “gotchas” (in part down to our lack of knowledge)

– In MySQL, data can be obtained from multiple tables with a single query, whereas in MongoDB it seems multiple queries are needed to obtain data from multiple collections. While there are latency advantages when dealing with a single collection, these become negligible when dealing with multiple collections. Also, tuning MySQL buffers and partitioning reduces the speed advantage once again.

The conclusion was to stick with MySQL but to keep an eye on MongoDB.

DropBox is just a frontend to Amazon S3 with a killer sync feature

Musing about iCloud, the forthcoming SkyDrive integration into Windows 8, and Google Drive got me thinking about DropBox, the company whose business model is built on charging for storage when everyone else is starting to give large amounts away for free. DropBox’s killer feature is its sync replication. It just works, and consumers have shown they love its simplicity. However, Apple has replicated the simplicity of the sync, albeit only for iOS users, and Microsoft is now close to the same with Live Mesh.

DropBox stores the files you give it on Amazon S3. This surprises many people, who assume the files are stored on DropBox’s own servers. It means that the entire DropBox business model is beholden to Amazon Web Services. Amazing when you think about it, and highly illustrative that what DropBox really brings to the table is great software with a killer feature. But what happens when everyone else has that killer feature, with 10x to 20x more storage for free?

A recent article had DropBox valued at 4 billion dollars. This is a valuation on a company doing revenues of between 100 and 200 million dollars per year, into which investors have poured 257 million dollars in funding. Perhaps it’s me, but I just don’t see it. Yes, they have a gazillion subscribers, but so what? In a commoditised industry that struggles to convert more than 2% of the user base, why should that get me excited? But there is DropBox Teams for businesses, right? Ever used it? Then try it and you won’t need me to draw a conclusion.

So what for DropBox if there is no mega IPO coming along? They turned down Mr Jobs (a mistake), so who else would be interested? What about Amazon? After all, DropBox really is the ultimate sync client for Amazon S3. With Amazon now looking towards the private cloud it would seem a match made in heaven. As with all good things, time will tell…

Comprehensive overview of PaaS Platforms

Looking to implement a PaaS? Wondering which product to start with, or how they compare? Well, there may not be an app for that, but there is a collaborative spreadsheet.

To view the spreadsheet directly on Google Docs click here (it seems Google only supports 50 concurrent viewers of a spreadsheet, so if you have an issue try again later).

Are we witnessing the death of public file sharing services?

The demise of MegaUpload, and rumours that the FBI has a hotlist of further sites to go after, has left other file sharing services running for their proverbial lives, with legitimate services often deciding to remove public file sharing from their offerings, despite arguments that the MegaUpload “bust” has done little to reduce internet piracy.

A list stored on the pastebin service shows the extent that the MegaUpload and MegaVideo closure has had on services:

  1. MegaUpload – closed.
  2. FileServe – closing; no longer selling premium accounts.
  3. FileJungle – deleting files; locked in the U.S.
  4. UploadStation – locked in the U.S.
  5. FileSonic – status unclear (under FBI investigation).
  6. VideoBB – closed; will disappear soon.
  7. Uploaded – banned in the U.S.; the FBI went after the owners, who are gone.
  8. FilePost – deleting all material (will leave executables, pdfs, txts).
  9. Videoz – closed and locked in the countries affiliated with the USA.
  10. 4shared – deleting files with copyright and waiting in line at the FBI.
  11. MediaFire – called to testify in the next 90 days; will open its doors to the FBI.
  12. Org torrent – could vanish with everything within 30 days (“he is under criminal investigation”).
  13. Network Share mIRC – awaiting the decision of the case to continue or terminate everything.
  14. Koshiki – operating 100%; Japan will not join SOPA/PIPA.
  15. Shienko Box – 100% working; China/Korea will not join SOPA/PIPA.
  16. ShareX BR – the UOL/BOL/iG group says it will join SOPA/PIPA.

For certain sites that previously failed to remove copyrighted files for long periods of time, this behaviour is clearly illegal, and in our opinion those services are rightly targeted. Other services that tuned their whole offering to enable users to upload copyrighted content, while charging users to access that illegally obtained content, can also have no complaints at legal intervention.

However, we are more concerned about services such as Box, DropBox etc. that offer public file sharing for legitimate purposes; to treat these the same as those aforementioned is clearly wrong. It is like trying to ban cars because robbers use them as getaway vehicles when robbing banks. Clearly car manufacturers did not design cars to rob banks! While the analogy may sound trite, what is concerning is that the authorities may well go after every file sharing service simply because it possesses public file sharing features.

Dealing with MySQL issues in the Cloud: Automating restart on error

MySQL is the mainstay of most cloud applications (including this WordPress blog!). However, if MySQL has an issue, whether through the number of connections maxing out or through MySQL being locked and unavailable, it can result in site outages. We’ve seen clients who have had their SQL DB down for anywhere from a couple of hours to a couple of days before they suddenly realised there was an issue.

To that end we wrote a small script that can be used to automate the restarting of MySQL in such scenarios.

The script is called mysqlrestart.sh and is listed below. You need root access to use it. If you ever reboot the server you will need to log in as root and run nohup ./mysqlrestart.sh & to restart it.

The script checks the number of connections every 30 seconds. If it cannot get a connection, or the number of connections is greater than the defined threshold (90 in the example below), it will restart MySQL.



#!/bin/bash
# Restart MySQL if the connection count breaches the threshold,
# or if we cannot connect at all.
SQLCONNECTION_THRESHOLD=90

echo `date` sqlrestart started >> run.out

while true; do
        sqlconnections=`mysql --skip-column-names -s -e "SHOW STATUS LIKE 'Threads_connected'" -u root | awk '{print $2}'`
        # exclude this script's own connection from the count;
        # a failed query leaves the count at -1, which also triggers a restart
        sqlconnections=$((sqlconnections - 1))
        echo `date` sqlrestart connections $sqlconnections >> run.out
        if [ $sqlconnections -gt $SQLCONNECTION_THRESHOLD ] || [ $sqlconnections -lt 0 ]
        then
                echo `date` restarting mysql server $sqlconnections >> restart.out
                service mysql restart >> restart.out 2>&1
                echo `date` restart complete >> restart.out
        fi
        sleep 30
done

Using the Power of Cloud Computing with SaaS Services

We recently started documenting the services we use on a day-to-day basis and it struck us just how much we use SaaS and Cloud Computing services. To that end we thought it would be fun / beneficial to share some of these and how and why we use them:

File Server: StorageMadeEasy

We use SkyDrive (25 GB free) in conjunction with Amazon S3 and an in-house PogoPlug installation to store files. We use the Storage Made Easy Cloud File Server to provide a unified view of all of our files (we are trialling PogoPlug support with them). This also enables us to assign user/file permissions and governance over the consolidated information stores, which is very useful. We can also create collaboration groups across the stores for sharing with clients, with the actual storage location abstracted away from the client. We also use the service to manage and share files on our iPad (tablet) and Windows Phone 7 clients.


CRM: Zoho

Having used several different CRM systems over time, we personally prefer Zoho. For up to 3 users it is free, and adding users and services is reasonably priced. It is also very easy to change the default templates and to use on the move (via the HTML5 mobile app). SalesForce is synonymous with CRM systems and SaaS, but Zoho is a good (cheaper) alternative.

Project Management: Basecamp

Basecamp can be a little expensive, but if you manage projects and want to collaborate it is hard to beat: its simplicity and easy-to-use web interface stand the test of time.

Source Code / Bug Tracker: BitBucket

There are many source code and bug tracking systems, many of them free, but we like BitBucket. It is free for up to 3 users and is a solid source code repository and bug tracker. For source code editing we use Textastic, which lets us hook into specific source files stored on SkyDrive via WebDav, using the SMEStorage CloudDav feature that is enabled when the iPad app is purchased.

Analytics: Google

Google Analytics is essential for tracking website statistics. There are alternatives (such as the fantastic Piwik) but it is hard to beat for ease of use. It is not flawless: the inability to track IP addresses (and therefore do a reverse DNS lookup) is frustrating, for example. We use AnalyticsPro on the iPad for mobile access/tracking.

Email: Gmail

Google Apps Gmail is a great email system. We’ve used it for years and have only good things to say about it. We back up our Gmail to SkyDrive using the SMEStorage Cloud File Server so that it can be indexed and searched along with our other files, and of course also for resiliency. For offline access on our iPads we use a customised version of reMail that we enhanced for the iPad.

Inbound Lead Tracking: Leadlander

LeadLander enables us to track the companies visiting our website and how often they visit, and the new people feature enables us to contact leads directly. It’s a great service and great value at a couple of thousand dollars per year.

Server Monitoring: Server Density & Pingdom & WatchMouse & PagerDuty

Server Density is great for monitoring thresholds on Apache, server processes, MySQL connections etc. We plug this into PagerDuty so we can be alerted by phone if major thresholds are breached. Server Density also provides an iOS app for push alerts. Pingdom is used as an added check for server outages, and WatchMouse is used to check quality of service via the time taken to load pages on a site.

MySQL Admin: PHPMyAdmin

We tend to use the command line but if we need to access MySQL graphically we use PHPMyAdmin.

Call Services: e-Receptionist and Skype and TollFreeforwarding

e-Receptionist is great for virtual teams if you want to route calls to Sales/Support etc. and you are not physically in the same location. Skype of course needs no explanation, except that we back up our Skype conversations to SkyDrive using the SMEStorage Cloud File Server so they can be indexed and searched, as we do a lot of communication over Skype. TollFreeForwarding is used to provide an international number and then route into the e-Receptionist infrastructure.

Online Marketing: Google Adwords and BuySellAds.com and LinkedIn Ads

Google Adwords is synonymous with online marketing and can be a great sales tool if used correctly, while BuySellAds is useful for advertising services on targeted sites using banner ads. LinkedIn Ads is a great way to reach a very targeted audience and, in our experience, a great way to complement any online marketing campaign.

Invoicing Services: BlinkSale

Blinksale is a great, simple SaaS invoicing service that is low cost and very easy to use and administer. There are many others, but for simplicity you can’t beat Blinksale.

Blog: WordPress

There are many others, such as Google Sites and Tumblr, but there are so many ways to use WordPress, and so many plug-ins, that nothing else really competes.

Social: Twitter and FriendFeed and Identi.ca and Google Plus and Hootsuite

We use a variety of social sites to push out news. There are of course so many now that it would be impossible to list them all, but we have covered the major ones. If you use multiple Twitter or social accounts then Hootsuite is a must.

… and there you have it: the types of SaaS and cloud services a distributed virtual company can use to run its business.

Amazon Cloud is now FISMA certified: Joins Google and Microsoft

The Amazon Cloud is now FISMA certified. FISMA is an acronym for the Federal Information Security Management Act, which sets security requirements for federal IT systems and is a required certification for US federal government projects.

This is the third certification Amazon has recently announced, coming on top of ISO 27001 certification and SAS 70 Type II certification.

The accreditation covers EC2 (Amazon Elastic Compute Cloud), S3 (Simple Storage Service), VPC (Virtual Private Cloud), and includes Amazon’s underlying infrastructure.

AWS’ accreditation covers FISMA’s low and moderate impact levels. This level of accreditation requires a set of security configurations and controls that includes documenting the management, operational and technical processes used in securing physical and virtual infrastructure, as well as third-party audits.

Other vendors that recently announced FISMA certification were Google, with Google Apps for Government, and Microsoft, with the Business Productivity Online Suite (although there was a spat between Microsoft and Google regarding these claims).

Expect to see further certifications: they are a prerequisite for expansion into lucrative government and private sector contracts as organisations feel more comfortable choosing cloud resources and commoditisation marches on.