<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://devproj.inf.ed.ac.uk"  xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>DICE development projects - dcspaul</title>
 <link>http://devproj.inf.ed.ac.uk/project-managers/dcspaul</link>
 <description></description>
 <language>en</language>
<item>
 <title>Content Addressable Storage (CAS) Encrypted Backup</title>
 <link>http://devproj.inf.ed.ac.uk/show/130</link>
 <description>&lt;div class=&quot;field field-name-field-projectid field-type-serial field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Project ID:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;130&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-current-stage field-type-taxonomy-term-reference field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Current stage:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;a href=&quot;/project-stages/5completed&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;5_Completed&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-manager field-type-taxonomy-term-reference field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Manager:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;a href=&quot;/project-managers/dcspaul&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;dcspaul&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-unit field-type-taxonomy-term-reference field-label-inline clearfix&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Unit:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;a href=&quot;/unit/inf-unit&quot; typeof=&quot;skos:Concept&quot; property=&quot;rdfs:label skos:prefLabel&quot; datatype=&quot;&quot;&gt;inf-unit&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-what field-type-text-long field-label-above&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;What:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div 
class=&quot;field-item even&quot;&gt;&lt;p&gt;&lt;b&gt;Description: &lt;/b&gt; A backup system which uses content addressable storage (to avoid&lt;br /&gt;
storing duplicate files) and encryption.  This will build on an&lt;br /&gt;&lt;a href=&quot;http://homepages.inf.ed.ac.uk/dcspaul/publications/CASFESB.pdf&quot;&gt;existing MSc project&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Deliverables: &lt;/b&gt; Backup server (Linux), at least one client (Mac).&lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-why field-type-text-long field-label-above&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Why:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;&lt;b&gt;Customer: &lt;/b&gt; All self-managed&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Case statement: &lt;/b&gt; This is Paul&#039;s &quot;pitch&quot;, verbatim:&lt;/p&gt;
&lt;p&gt;
Background:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Disk space on personal computers is increasing.&lt;br /&gt;
    Several hundred GB is currently typical.
&lt;/li&gt;&lt;li&gt; Laptops especially are now often seen as &quot;personal&quot; machines.&lt;br /&gt;
    They hold personal and work data.&lt;br /&gt;
    It is difficult (impossible?) to automatically distinguish these.
&lt;/li&gt;&lt;li&gt;Most users (for home or work use) have ad-hoc (at best) backup schemes.&lt;br /&gt;
    These usually involve only a subset of the data with irregular backup times.&lt;br /&gt;
    The security of the data is questionable - is it encrypted? where is it stored?
&lt;/li&gt;&lt;li&gt;Corporate backup schemes are often inappropriate.&lt;br /&gt;
    They can&#039;t handle the volume of data.&lt;br /&gt;
    They don&#039;t guarantee security of personal data (encryption).
&lt;/li&gt;&lt;li&gt;Home backup schemes may be convenient, but have other problems ...&lt;br /&gt;
    E.g. Time Machine is not encrypted (what if the disk is stolen?)&lt;br /&gt;
    And it is not stored off-site (what if there is a fire?)&lt;br /&gt;
    And it doesn&#039;t work well if the laptop is encrypted ...
&lt;/li&gt;&lt;li&gt;&quot;Cloud&quot;-based backup schemes are becoming popular ...&lt;br /&gt;
     Many offer encryption, but ...&lt;br /&gt;
     They are very slow for the typical volumes of data.&lt;br /&gt;
     They are proprietary and the user depends on the continued existence of the specific service.
&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The Concept:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Using a &quot;content-addressable&quot; storage technology, it is possible to store only&lt;br /&gt;
    a single copy of files (or even parts of files) which are common to multiple users.&lt;br /&gt;
    This yields significant savings when there are lots of users, e.g. many of them&lt;br /&gt;
    have the same OS and application files. 
&lt;/li&gt;&lt;li&gt;Encrypting files at the client end is normally incompatible with content-addressable&lt;br /&gt;
    storage, because the same file encrypted with different keys will appear in the backup&lt;br /&gt;
    as a different file.
&lt;/li&gt;&lt;li&gt;We have an algorithm which we believe supports encrypted, shared storage. If&lt;br /&gt;
    a number of users back up to the same system, there will only be one copy of all&lt;br /&gt;
    the common files (OS, applications, shared documents, etc.). This saves&lt;br /&gt;
    considerably on space and (more importantly?) backup time. In addition, each user&#039;s&lt;br /&gt;
    files will only be accessible with their own key.
&lt;/li&gt;&lt;li&gt;To implement this, we would need to build a backup server.&lt;br /&gt;
    And a client for each platform.
&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Who would use this?&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Individual users may use this at home to back up multiple family machines.
&lt;/li&gt;&lt;li&gt;Departments may use this to back up users&#039; laptops (or desktops).
&lt;/li&gt;&lt;li&gt;Service providers may offer this as a service &quot;in the cloud&quot;.
&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;What would I like to do?&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;I currently have an MSc project implementing a prototype to explore the concept.&lt;br /&gt;
     This will not be sufficient to evaluate it for commercial use (performance etc.)
&lt;/li&gt;&lt;li&gt;It would be good to build a realistic client and server to evaluate performance.&lt;br /&gt;
    Maybe a server for Linux. Maybe a client for the Mac.&lt;br /&gt;
    Maybe the server would be open source (to encourage implementation of compatible clients).&lt;br /&gt;
    Maybe the client would be charged for.
&lt;/li&gt;&lt;li&gt;It would be good to do a marketing survey.&lt;br /&gt;
    Exactly what else is available? And how does the performance compare?
&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-when field-type-text-long field-label-above&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;When:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;&lt;b&gt;Status: &lt;/b&gt; &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Timescales: &lt;/b&gt; Plan to start in November.&lt;br /&gt;
Programmer is employed for 5 months.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Priority: &lt;/b&gt; &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Time: &lt;/b&gt; &lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-how field-type-text-long field-label-above&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;How:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;&lt;b&gt;Proposal: &lt;/b&gt; &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Resources: &lt;/b&gt; CO involvement (Toby) during project, see Plan for further details.  Probably 1-2 weeks CO time in total.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Plan: &lt;/b&gt; Paul&#039;s plan, slightly edited for context, from email 2009-10-01:&lt;/p&gt;
&lt;p&gt;
Plan is to start in November.&lt;/p&gt;
&lt;p&gt;
My current (rough) plan is ...&lt;/p&gt;
&lt;ol&gt;&lt;li&gt;1 month or so background &amp;amp; planning -
&lt;ul&gt;&lt;li&gt;look at filesystem details, attributes, ACLs, etc ...
&lt;/li&gt;&lt;li&gt;design rough architecture
&lt;/li&gt;&lt;li&gt;think about possible optimisations (to implement now or later)
&lt;/li&gt;&lt;li&gt;think about what is needed at the server end (cloud?)
&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;3 months head down implementation
&lt;/li&gt;&lt;li&gt;1 month testing &amp;amp; evaluation
&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;
I&#039;d really just like to keep in touch with the COs, and get a bit of practical help.&lt;br /&gt;
So what I think I&#039;d want from you is ...&lt;/p&gt;
&lt;p&gt;
During (1), join in the discussions on the design - say one or two brainstorming&lt;br /&gt;
meetings a week &amp;amp; some email.&lt;/p&gt;
&lt;p&gt;
During (2), do a bit of testing/evaluation of the code. Maybe set up anything&lt;br /&gt;
simple that is needed at the server end - probably just filespace, but maybe&lt;br /&gt;
a few simple CGIs or something.&lt;/p&gt;
&lt;p&gt;
During (3) do some testing and perhaps find/persuade some people to try it&lt;br /&gt;
out for an evaluation of the performance.&lt;/p&gt;
&lt;p&gt;
So ... I guess that the larger time commitment would come at the beginning and&lt;br /&gt;
the end.&lt;/p&gt;
&lt;p&gt;
Of course the money is tight so there won&#039;t be any flexibility in the timing, because&lt;br /&gt;
we will only have the programmer for a fixed 5 months ...&lt;/p&gt;
&lt;p&gt;
Ultimately, there may be a practical benefit for Informatics if it&lt;br /&gt;
turns into something useable. If not, I guess we all learn something&lt;br /&gt;
at least ...&lt;/p&gt;
&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class=&quot;field field-name-field-other field-type-text-long field-label-above&quot;&gt;&lt;div class=&quot;field-label&quot;&gt;Other:&amp;nbsp;&lt;/div&gt;&lt;div class=&quot;field-items&quot;&gt;&lt;div class=&quot;field-item even&quot;&gt;&lt;p&gt;&lt;b&gt;Dependencies: &lt;/b&gt; None&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Risks: &lt;/b&gt; &lt;/p&gt;
&lt;p&gt;&lt;b&gt;Milestones&lt;/b&gt;&lt;/p&gt;
&lt;table&gt;&lt;tr&gt;&lt;th&gt;Proposed date&lt;/th&gt;
&lt;th&gt;Achieved date&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</description>
 <pubDate>Fri, 25 Jan 2013 15:45:58 +0000</pubDate>
 <dc:creator>boss</dc:creator>
 <guid isPermaLink="false">1983 at http://devproj.inf.ed.ac.uk</guid>
</item>
</channel>
</rss>
