Ubuntu and Privacy and how it really works now.

There have been quite a few entertaining discussions on the interwebs about Ubuntu and concerns around privacy. This topic comes and goes on a regular basis, today it has come up because Mozilla are planning on putting some fairly harmless adverts on the blank tiles of new tabs and this is being compared to the Dash search in Ubuntu. Whenever the topic is raised it tends to be a fairly heated discussion, mostly focussing on the Amazon search results in the dash, mostly calling that adverts or spyware. It is a discussion that is mostly overblown and underinformed, with so much time spent freaking out about “adverts” that the real problems have been completely missed. Lets go through a bit of history, and I will try and explain the difference between the real problems and the FUD.

Initially there was the Gnome 2 application launcher, kinda similar to the Windows start button, it is a way to run applications that you have on your computer. They are nicely categorised so you can find all the graphics related applications on your computer and see Inkscape alongside Gimp and choose what you want to run. This worked well and people were generally satisfied at this mechanism for running local applications. Then along came Unity, this introduced the launcher, a dock bar on the left that shows running applications and has the ability to pin applications so you can start them by clicking on them when they are not running. The launcher is the way to run applications that you have on your computer – but not all of them, and not categorised, just your favourite ones you have pinned to the launcher. Unity also introduced the dash. This has a different scope of functionality, I like to call it the OmniGlobalEverywhere search tool. You type stuff in and it searches in lots of places to find what it is you are looking for. This is not the same scope of functionality as the Gnome 2 application launcher, it could search for local files, videos on YouTube and other streaming services, music, photos, other things. It is an extensible search interface and you can plug in additional search things. I wrote an OpenERP plugin so I could type an invoice number and jump straight to that invoice in a browser for example. It was a pretty cool concept as a jack of all trades search interface – but it isn’t the master of the specialised job of viewing and running applications you have already got installed.

Everyone completely missed the fact that the magic privacy button for a long time did almost nothing – it was just an undocumented flag that some lenses looked at and turned themselves off. Others did not. This was a real big deal and nobody noticed because they were obsessed with calling Amazon search results adverts. Now we have all kinds of odd lenses and search queries possibly going to yelp, zotero, yahoo finance, songster, songkick, gallica, europeana, etsy, COLORlovers and other places. Have you even heard of every single one of these? Do you know they are not evil? Do you know they are financially stable enough not to close the doors and let the domain renewal lapse for someone evil to buy it? Amazon I know and trust to continue existing, I also trust them not to want searches for partial mostly irrelevant words for profiling data when they have my product purchase history. The utter junk that the dash sends is of no value to Amazon compared to everything else they have, but this doesn’t stop people banging on about that one specific, relatively harmless and pointless in equal measure lense.

Firstly the Amazon lens is nothing special, and it is perhaps the internet connected lens I am least worried about. I trust Amazon to do what I expect them to do, I am a customer so they know what I bought, sending them random strings like “calcul” and “gedi” and “eclip” does not give them valuable data. It is junk. I am much more concerned about stuff like the Europeana, jstor, grooveshark lenses which do exactly the same thing but I have no idea who those organisations are or what they do. Even things like openweathermap, sounds good, but are they really a trusted organisation?
So, back to how it works. Your query for “socks” goes to products.ubuntu.com. At that point canonical’s secret sauce server looks at your query and decides that most people who search for socks either want to know about products to buy, or applications to run. They don’t tend to click on the results from the medicines or recipes lenses when we try showing those lenses to the user. So, having decided that the shopping lens and the applications lens are reasonable ones to search in it sends the query to Amazon (being the only shop currently supported, but it is designed to support every online sock vendor in the world) and tells your computer that the applications lens is worth looking in. When it gets the results back from Amazon those go to your computer, as a bunch of json data that is very similar to the Amazon json API, Amazon at this point thinks that Canonical’s server has got cold toes and is in need of some nice warm socks. Amazon does not know you exist at this stage.

[iframe src=”http://rcm-eu.amazon-adsystem.com/e/cm?lt1=_blank&bc1=000000&IS2=1&bg1=FFFFFF&fc1=000000&lc1=0000FF&t=theopesou-21&o=2&p=8&l=as4&m=amazon&f=ifr&ref=ss_til&asins=B003QI99FK” style=”width:120px;height:240px;” scrolling=”no” marginwidth=”0″ marginheight=”0″ frameborder=”0″]

That bundle of sock related data goes to the shopping lens on your computer, which then displays the results. It does this by showing some text “stripy socks, only £5.30” and a picture, which it used to retrieve from Amazons content distribution network – O.M.G.!!! a data privacy leak. Amazon could log hits to their CDN (which I doubt they do), consolidate them globally, and figure out that it was displaying a bunch of sock pictures requested by your IP address, shortly after Canonical’s server searched for socks, so they could theoretically tie this together and infer that the reason you are staring at sock pictures is because you searched for socks via the dash search tool. So this huge and seriously concerning data privacy breach was a problem, so they fixed it. Now when you search for socks, Amazon gets CDN requests for images from products.ubuntu.com. Your computer gets the images from products.ubuntu.com (over https rather than http), it is now basically a reverse proxy for Amazon images, so that amazon is now more convinced than ever that Canonical’s server has got cold toes. As it happens, there is nothing wrong with your toes and you actually wanted to configure a socks proxy all along, and the shopping thing was a pointless overhead because when you want new socks the dash isn’t where you dash to.

There is a conversation on the technical board mailing list here https://lists.ubuntu.com/archives/technical-board/2013-October/thread.html and here https://lists.ubuntu.com/archives/technical-board/2013-November/thread.html relating to the closedness of the server side app. Having written something a bit similar myself, mine was closed for a while because it contained the Amazon API oauth keys in the source code. There really isn’t much to it on the server side. My server code is here https://github.com/AlanBell/shopping-search-provider/blob/master/server/index.php

We are supporting Code Club, and so should you!

Much has been made of the recent announcement of the Year of Code and the underwhelming interview on Newsnight of Lottie Dexter which contained some selected footage from what appears to be a class on jQuery, possibly by Code First:Girls in which coding is described twice as gobbledegook and went on to have Lottie Dexter announce that she was unable to code. This is not ideal for the director of an organisation that is supposed to inspire and promote the teaching of coding. I don’t demand a string of coding accomplishments from such a position, it is just that without a basic understanding of coding it is hard to articulate how much fun it is. Computing in schools fell apart as a subject in the mid 90s, the emphasis changed from doing programming projects and educational activities to using spreadsheets, word-processors and desktop database applications. In many schools the teaching the foundational skills of computing was replaced by Microsoft Office training. This is not the same, and something I have been concerned about for many years, it is one reason I was involved with supporting the Open Source Schools project around the time of the end of BECTA and one reason why we exhibited at BETT and introduced teachers to the OLPC project and the thinking behind it. A couple of years ago when taking my eldest to an open day at a local secondary school the first words out of the mouth of the teacher when we got to the ICT room were “Don’t worry, there is no coding in this subject”. We selected a different school.

This is all quite sad, but it is fixable. Coding is fun and easy, teaching it is fun and easy. I know this because I do it. Every Tuesday afternoon this term I am visiting a school a few miles away to run an after school Code Club. We are doing programming projects using Scratch, here is the project we did this week, it is a fruit machine that cycles through a few images and you click the images to stop and try to get them all to line up.

[iframe allowtransparency=”true” width=”485″ height=”402″ src=”http://scratch.mit.edu/projects/embed/17799833/?autostart=false” frameborder=”0″ allowfullscreen]

Part of the code required to do this looks like this:

code

It is programmed by dragging and dropping the commands from a palette of options (which is particularly great on an interactive whiteboard), no typing or spelling errors involved and the club of year 5 (age 9) programmers now know about variables, random numbers, if statements, infinite loops, bounded loops, signals, events and designing a fun game by balancing parameters to make it not too easy and not too hard. They have been trying things out, experimenting, getting things wrong and figuring out what the problem is and what they need to do in order to get the outcome they want. This is computing and it is the foundation of the skills we want coming into the industry.

I would encourage everyone in the IT industry, or with an interest in IT in the UK (and elsewhere, but some of this is UK specific) to get involved in Code Club . The Code Club website allows schools to say that they would like to have a Code Club, and volunteers to search for schools in their area that want one. This means that you do not have to approach the school and start by explaining what it is all about and why they should want to have a Code Club. They already know that bit, it means you have to do nothing to “sell” the concept to the school. The activity plans are great, the coders love them and you don’t have to decide what you are going to do each week, that is all done for you. There is a bit of admin and checking that is done in advance, you get a security check called a DBS, but that is all arranged and paid for by STEMNET.

I don’t know if the Year of Code organisation will make any particular contribution itself, but the Newsnight appearance and subsequent kerfuffle has certainly brought some attention on the efforts of Code Club, Young Rewired State, the Raspberry Pi foundation and some other organisations which are actively working to bring the fun of coding back into UK schools and this is a good thing.

Cluster update

I am  delighted to say that the Raspberry Pi cluster project is now fully funded to the first target of £2,500, this means that the Indiegogo fees will be 4% of the total rather than the 9% which applies to partly funded flexible campaigns. The money received by Paypal has already partially cleared, so we have been out spending some of it, here is a collection of Raspberry Pi units doing some load testing.

Initial testing

There are many ways to build a cluster and many decisions to take along the way, like how to power them, what SD cards to use, whether to overclock them, how to do networking, how to fix them together etc. I will try to explain some of the reasons behind what we are doing and what we tried and didn’t like so much.

Powering the Pis

The first two criteria for powering the cluster was that it must be safe, and it must look safe. These are not the same thing at all, it is quite easy to have something with bare wires all over the place that looks a bit scary, but is entirely safe. It is also possible to have it looking great, but overloading some components and generating too much heat in the wrong place and build something that is a good looking fire risk. A single large transformer was one approach, difficulties would be handling the connection from 20A cable or rail (basically like mains flex, the current decides the wire gauge, not the voltage) down to MicroUSB, most electronics components like a USB socket or stripboard are rated for 2.5A max so we would end up with chunky mains grade connectors all over the place, which looks scary, even if it is entirely safe. After a bit of experimentation we found a D-Link 7 port USB hub with a 3A power supply and decided to see how many Raspberry Pi  devices we could power from it, turns out that it can do all 7, which was a bit of a surprise. We know the Pi should be able to draw 700mA for a reliable supply, but that is when it has two 100mA USB peripherals plugged into it and is running the CPU and GPU flat out. As we are not using the USB ports and we won’t be using the GPU at all, our little Pi units only draw about 400mA each. This simplifies the power setup a lot, we just need several of these high powered hubs giving us a neat, safe and safe looking setup. The power supply for the hub does get a little warm, but I have tested the current draw at the plug and we are not exceeding the rated supply.

Networking

Initially I wanted to find out if we could do it all with WiFi. This would cut out the wires, would give us a decent theoretical peak speed and could in theory be supported by a single wifi router. After testing Pi compatible Wireless N dongles the performance just wasn’t there, the max we could get was 20Mbit/sec, whilst with wired networking 74Mbit/sec was achievable. I am not sure if this was a limitation of the USB system or the drivers, but it became clear that wired networking would be significantly quicker. Having decided that wires are the way forward it came to choosing switches. One big switch or lots of little ones? Well price/performance ratio of the small home switches is just unbeatable. We settled on some TP-Link 8 port gigabit switches. Obviously the Pi would only be connecting at 100Mbit (link speed) but the uplink to the backbone switch is at gigabit speeds. Choosing the 8 port switch meant that we were going to have groups of 7 Raspberry Pi units and one port for the uplink. This approach of multiple hubs has the excellent side effect that the cluster is modular. Every shelf can run as a self-contained cluster of 7 devices networked together, we then join them together using a backbone hub to make a bigger cluster.

Physical setup

Here is the first layout attempt. It uses a 30cm x 50cm shelf, with the pi units screwed to wooden dowels pushed into holes drilled in the shelf. There are holes drilled through for the network cables, which were snipped and re-crimped on the other side.

Pi On a Board

The router and power setup were screwed to the underside of the shelf. This setup was a bit fiddly to build, crimping network cables is a bit time consuming and the dowel arrangement wasn’t as neat as I wanted.

pi on the side

The Raspberry Pi doesn’t really have a flat available side to it, I was thinking of removing the composite video and audio out connectors to produce a flat side for fixing things to, then I noticed that if I drill some holes just the right size then the composite connection makes quite a reasonable push-fit fixing for a sideways mounted unit. Here is the shelving unit they are going to be fixed to, it is an IKEA Ivar set with 8 30×50 shelves. One design goal is to use easily available parts so that other people can replicate the design without sourcing obscure or expensive components. Wood is a great material for this kind of project, it is easy to cut, drill and fix things to, and it is a good thermal and electrical insulator – I wouldn’t want to accidentally put a Raspberry Pi down on a metal surface!

shelving unit

More updates will follow as the build progresses, if you have any suggestions on different approaches to any of the decisions on power/networking/fixing then do leave a comment, the design isn’t fixed in stone and we could end up changing it if a better idea comes along. Any further contributions to the campaign would also be gratefully appreciated, they will go towards filling up more shelves!

Building Ubuntu for the Raspberry Pi

As a result of the prior musings about crowdfunding and the rather shaky VAT status of the whole sector I have been thinking quite a bit about crowdfunding and where it might be useful and how we could get involved in some way. For our normal consultancy business we have no need of capital investments and we don’t produce anything that lends itself to the crowdfunding model, however I did come up with a project I have been wanting to do for quite a long time. Allow me to introduce it by way of a little video . . .


Back when the Raspberry Pi was in development it was shown running Ubuntu 9.04, Jaunty Jackalope. This was the last Ubuntu release that supported the ARMv6 instruction set, from that point on Ubuntu was optimised for newer ARM chips and would not run on the Broadcom chip that the Pi used. I am the point of contact of the Ubuntu UK Local Community team and I was dead excited about this little computer with it’s exposed PCB and low price point. I asked some of the Ubuntu ARM folk if they could support it going forward, but that wasn’t going to be possible, they didn’t have the resources to build for two ARM platforms and the bottom line was that the Pi probably wasn’t going to provide a good user experience for the increasingly heavy Ubuntu user interface. This was sad, but it was the situation. I was a bit concerned that the Raspberry Pi foundation was proceeding on the basis that Jaunty was available – it was already old, going out of support and was a dead end, there were going to be no future updates for it. I was concerned that the UK Local community was going to be landed with a lot of new users who were having a poor user experience and there would be nothing we could do about it. Reluctantly I approached the Raspberry Pi foundation (I met the lovely Liz and Eben at an event in Oxford) and shared my concerns with them, and suggested Debian was the way forward, so the Pi would have a system based on a platform Ubuntu users would be familiar with, that would get updates.

So this was sad, I wasn’t happy about it, the foundation wasn’t happy about it, many users were not happy about it, but it was much better to have a new Debian with updates and prospects than an old dead end Ubuntu release.

Moving on to the present, the Raspberry Pi is a huge success, Rasbian is a great operating platform for it, the LXDE desktop is fine, the Wayland demo was brilliant and loads of cool projects are happening based on the Pi. We still want Ubuntu on it though. We are using it in embedded projects, it is also turning up in things like the OpenERP Point of Sale kit, situations where it doesn’t need a responsive user interface (or a user interface at all). It would be great to know that all the libraries we are using on it are the same versions we are using on other computers that are running Ubuntu. It might be nice to see what the Ubuntu Unity desktop looks like on the Pi, especially Unity 8 running in Mir, but that explicitly isn’t a goal. This project aims to build everything that will build from source without too much hassle. If that gets us a desktop then great, if it gets us a command line with python, that is great too.

Now for the armchair accountants in the audience, having seen the admin end of a campaign I can explain it a little better than before. This is a flexible funding setup rather than the all-or-nothing option and we are accepting paypal and credit card pledges. The paypal pledges happen instantly, the money goes from the end user direct to our paypal account and then there is an immediate debit of 9% of the amount which goes from us to indiegogo – so the money is not held in escrow at all, and it isn’t a big payment at the end. This is fairly clearly a purchase of a pledge to the full pledge value and a subsequent payment to indiegogo which is either a purchase of campaign hosting services, or some kind of financial services fee, not sure about that bit yet. Credit card payments are slightly different, we don’t have the money for those yet, after the campaign ends Indiegogo will do a bank transfer to us for the funds (less the 4% or 9% commission presumably). Paypal is regulated as a bank now, so I think the money should turn up in our financials when it is in the paypal account, not just when we make a transfer of it to a bricks and mortar bank. We will enter all the pledges as sales and pay VAT on them and we will reclaim the VAT on the materials purchased to build the cluster. If anyone wants a VAT invoice for a paypal pledge I can sort that out. Credit card pledges are a bit more interesting as it is questionable whether they have happened yet.

If you want to contribute to the cluster and help us build Ubuntu for the Raspberry Pi then do head on over to Indiegogo and join the 40 or so other contributors we have so far.

From the technical side of things, designing the cluster feel free to pitch in your comments and suggestions below. We have had a lot of people suggesting that we don’t use the Raspberry Pi and use some other platform instead. These suggestions include: cross compile it from Intel machines, use QEMU on fast Intel computers, use cloud computing, use a Power Mac (whut!), use the OpenSUSE Build Service, Use a Calxeda box, use Pandaboards, use Wandboard quad core arm boards. Feel free to add to the list of other platforms we should be using instead, I think I will add the yet to be delivered Parallela board to the list of things we should be using. All these suggestions are great, they would work and they might even be faster or easier. They just are not things I really want (apart from the Parallela which I don’t have) and I don’t think it works as a crowdfunding concept to raise funds to build it out of anything but the Raspberry Pi.

To provide power to lots of Pis there are a few approaches, Southampton University did this:

and other cluster projects have build custom 5v electronics for feeding the USB or direct to the GPIO pins. The custom supply option doesn’t work out particularly cheap and to run the whole cluster you are looking at parts of the circuit supporting a current heading towards 32 amps, which gets kinda complicated. At the moment I am leaning towards using a special powered hub, the Pihub which can cope with powering 4 Pi devices from a single slightly beefy supply. This keeps the plug count down (they will all need PAT testing at some point so I don’t want to go completely wild on plugs) and keeps everything neat and safe and fanless.

Networking is another area where there are options. WiFi sounds mad for a cluster, but is it really? The Pi Ethernet port kind of hangs off USB internally, so wouldn’t a 150Mbit USB wifi dongle be comparable to a 100Mbit ethernet? Lets solve this using science. Initial testing with iperf shows 74Mbit throughput on the ethernet between two Pi devices, over WiFi just 20Mbit. This is rather less than I would expect, maybe there is more performance that can be teased out of the wifi, or maybe the initial feeling is right and ethernet is the way forward. Maybe you have an opinion or advice in this area?

The funding campaign runs through to Christmas but as we have some of the money available already I am thinking we will probably start getting some bits fairly soon and start setting up the cluster controllers and do some power measurements and more detailed performance testing.

Crowdfunding and VAT

The trendy way to get your investment capital these days is to put together a slick video and shove your concept on Indiegogo or Kickstarter. You offer some gifts/rewards for different pledge levels and set an overall funding target and then sit back as everyone talks about it and does your advertising for you for free. Awesome stuff. It has been used for different types of project, often for bringing a bit of hardware from a prototype into production, and the pledge often includes one of the products – but there is an element of risk to it, some are delayed like the Parallela (which we funded, still waiting for our two boards) and some like The Doom That Came To Atlantic City and Clang appear to take a lot of money and deliver nothing. Some don’t meet the funding target, the most spectacular example of this was the Ubuntu Edge which managed to break the records for the most money pledged and the biggest shortfall at the same time, which is quite a clever trick. I was contemplating backing the Edge, but I certainly didn’t want to put it on my personal credit card, I wanted to put it through as a company expense – it would have been an interesting toy for us to play with. Libertus is a VAT registered company, this means we charge VAT on things we sell to our customers, and we reclaim VAT on stuff we purchase from suppliers – it is a “Value Added” tax not a sales tax. We pay tax on the value we add to the goods in the supply chain. This makes a lot of sense in a products business where you buy raw materials, do some process to them, and sell finished goods, but it also works just fine when we sell services and buy assorted bits and pieces that are not strictly raw materials. The upshot of this is that as a VAT registered business, when we buy pretty much anything, we can reclaim the 20% VAT that our supplier added in the price. So, back to crowdfunding, I asked Canonical if I would get a VAT receipt for the £430 or so that it would cost me for the phone, so I could reclaim the £71.66 VAT (or offset it against my VAT on sales, you don’t actually get money back from HMRC unless something is going very wrong in your business). The answer was no, they don’t issue VAT receipts, which kind of makes sense, sort of. It isn’t a product purchase, it isn’t an investment (there are lots of rules about what an “investment” is, and this isn’t one) it is basically an at-risk donation. So I can’t reclaim VAT on it. On the other side, is it a sale? Does the supplier have to remit VAT to their tax authority on the sale. Well, probably. You can’t just wriggle out of VAT by trading exclusively on a crowdfunding basis. Tax fiddles don’t work, they can look at the substance of what is happening even if the details are a bit dubious. If it walks like a sale, and it quacks like a sale then the tax authorities will want their slice of the party.

The other twist to this is that the major crowdfunding platforms are based outside Europe, Kickstarter is in New York,  Indiegogo in San Francisco. The USA has state level sales taxes, and no VAT. The platforms are a party to the sale, you pay your money into their account, it is held in escrow for a bit, then released to the project with a percentage fee deduction. How does this affect the sale, am I purchasing the gift from the USA? Is there import duty now? Does this exempt it from VAT in some way or not?

This week our friends at OpenERP have launched their own crowdfunding campaign for a retail Point of Sale solution, based on our favourite little computer – the Raspberry Pi, and some other bits of hardware.

OpenERP Point of Sale

This is a cool project, I have been wanting to put together all these bits for some time, I bought a receipt printer and barcode scanner for development/demo purposes, but I don’t have a cash register and I have not had time to write the ESC/POS driver for the printer. This project will do the driver for the receipts properly and it assembles a set of reference hardware that can be reliably supported by OpenERP, which means we can help open up the retail sector to Free Software from the point of sale through to manufacturing, logistics, accounting and everything else. In short, this is great, I want it and it is a totally legitimate business expense for us – but I would really like to know how we account for the VAT element. Normally for a purchase from Belgium we would do reverse charge VAT, we notionally add 20% to it, then reclaim that back again, so there isn’t much net impact, but I have no idea if I need to do that on a crowdfunding pledge. Do comment if you have any thoughts on the matter!

Announcing ExceptionalEmails.com

If you are a sysadmin or developer or similar you probably get a bunch of emails from systems telling you they are doing just fine. You probably have mail rules to shove these off into some folder you never look at so you can get on with life. If one should happen to not turn up, that would be kind of interesting, but there is no email rule you can make to alert you about an email that didn’t happen. Over the last couple of weeks I have been building a system to fix that http://exceptionalemails.com. You basically shove all the emails you get at a set of special email addresses, one for each type of regular email, and set up rules saying what you expect to happen. You then get on with your life, and if an email fails to happen, or perhaps contains the wrong words (fail/error/out of disk space/etc.) then and only then we will send you an email – you only need to see the exceptions.

This is the form to set up the rules for an alert, so in this example I would set my fileserver backup schedule to email alanbell1+fileserver@exceptionalemails.com when it is done (or leave it emailing me, and set up a rule to put the mail in a folder of my email and forward the mail to alanbell1+fileserver@exceptionalemails.com)

an alert form

 

This was my first project using MongoDB as a back end and I have been really impressed by it, I have a background in NoSQL and it all made sense to me in terms of performance expectations and optimisations. I load tested it with a million emails and it was still really fast. It is running on Ubuntu server, with a user interface is written in PHP. The back end jobs that receive emails and check for alerts going overdue are written in Python.

I would be really interested in any feedback on the site, I have some plans for improving the analysis of past emails with sparklines so you can see when failures happened, and maybe fluctuations of arrival times of emails. Any other suggestions would be welcome. There is an outside chance that I might write a JuJu charm for it – and probably do a bit of a refactoring of the code to make deployment easier. One of the reasons for choosing MongoDB at the back end and a separate process to receive the emails was to allow it to scale horizontally across a bunch of servers. Based on my load testing I couldn’t hammer it hard enough to slow things down noticeably so I am not sure my grand clustering plans are going to be required.

The code is on Github, under AGPL3 and I am tracking issues there.

« Previous PageNext Page »