{"id":19,"date":"2006-02-12T20:21:11","date_gmt":"2006-02-13T01:21:11","guid":{"rendered":"http:\/\/mattwork.potsdam.edu\/blog\/?p=19"},"modified":"2013-09-20T16:22:41","modified_gmt":"2013-09-20T20:22:41","slug":"copyfs-is-now-even-cooler","status":"publish","type":"post","link":"http:\/\/www.matthewgkeller.com\/blog\/2006\/02\/12\/copyfs-is-now-even-cooler\/","title":{"rendered":"CopyFS is now even cooler"},"content":{"rendered":"<p>[<strong>UPDATE<\/strong>: There is a <a title=\"CopyFS Update\" href=\"http:\/\/www.matthewgkeller.com\/blog\/2013\/09\/20\/copyfs-update\/\">newer version of CopyFS available<\/a>]<\/p>\n<p>Some backstory: I used to write a LOT of code in the C programming language. I used to be actively involved in numerous open-source programs that used C. Hell, I even <a title=\"Apache Server Commentary\" href=\"http:\/\/www.matthewgkeller.com\/blog\/books\/apache-server-commentary\/\">co-authored a book<\/a> that was completely about C code, written in the format of the much-loved classic <a href=\"http:\/\/www.amazon.com\/gp\/product\/1573980137\/sr=8-1\/qid=1139789960\/ref=pd_bbs_1\/102-0751171-8106528?%5Fencoding=UTF8\">Lions&#8217; Commentary on UNIX<\/a>, and available in 6 languages. That was &#8220;back then&#8221;. I&#8217;ve gotten spoiled with Perl, Java and C++ over the last half-decade and haven&#8217;t written more than trivial patches in C in &#8230; 6 years. I still read C fluently&#8230; But not so much on the writing.<\/p>\n<p>So going [almost] back to my programming roots, I took on a project that forced me to dive head-first back in C. And not just a wussy application&#8230; No sir. A filesystem. 
Some guys beta test backup software, and some guys use their laptop as a live debugger for filesystem enhancements. Before I get into the details of the software, let me say that I&#8217;ve finally popped back into the world of seeing C solutions, and knowing exactly how to get what I want out of the code&#8230; I&#8217;m not at the level I&#8217;m at with Perl, where I can literally have a conversation with a person in the language itself, but my C-fu is, again, strong.<br \/>\nSo enough about me. <a href=\"http:\/\/n0x.org\/copyfs\/\">CopyFS<\/a> is a <a href=\"http:\/\/fuse.sourceforge.net\/\">FUSE<\/a> filesystem that supports file versioning. v1.0 is a pure copy-on-write filesystem. Anytime you make a change, it makes a copy. If you change the metadata of a file, it makes a copy. You can list all of the versions of a file, and make any version the &#8220;current&#8221; version very easily. It really is a great tool.<br \/>\nI&#8217;ve been using CopyFS for a while now in various venues, the most active of which is my <a href=\"http:\/\/www.eclipse.org\/\">Eclipse<\/a> development tree, where nearly all of my Perl, Python, C++, Java&#8230; and now C code is developed. It allowed me to have my own little revision control system on my filesystem without the complicated mucking around with CVS\/SVN\/git\/etc. repositories.<br \/>\nAfter using it for the quite-a-while I have been, the lack of certain features became a little painful. As CopyFS is open source software, I could just complain that the software doesn&#8217;t do what I want, write a snotty post to a mailing list and assert the entitlement that others seem to believe they have just because they downloaded a piece of software&#8230; Or I could enhance it. I could solve my own problems.<\/p>\n<h3>Problem 1: Text Diffs<\/h3>\n<p>By far, the largest problem I was having was determining WHICH of the 212 versions of a source file were the ones I wanted to revert to. 
Ok, I broke something in the latest version, where&#8217;s one that doesn&#8217;t have changes to that area? That was done with no core changes- just 90ish lines of additions and a couple changed lines in the <span style=\"font-style: italic;\">fversion<\/span> userspace application- All Perl.<\/p>\n<h3>Problem 2: Way too many versions<\/h3>\n<p>As I mentioned previously, any change to a file will trigger a copy. I have some files with <span style=\"font-weight: bold;\">hundreds<\/span> of versions. For some files, this is great. For others&#8230; For example a pre-linked object file, this is unnecessary. While the sum of the code additions necessary to add this into CopyFS was a mere 200 lines, it took me <span style=\"font-weight: bold;\">thousands<\/span> of lines of C over the past few weeks to get to that 200 lines. Using <span style=\"font-style: italic;\">fversion<\/span>, you can now tell the CopyFS daemon you want to purge the oldest <span style=\"font-style: italic;\">N <\/span>versions, or all versions, of a given file.<\/p>\n<h3>In progress&#8230;<\/h3>\n<ul>\n<li><span style=\"text-decoration: line-through;\">Another piece to the problem 1 puzzle is the ability to search your versions for a string or a pattern. I&#8217;m workingish on that. I have it workingish in a pure-Perl solution, but I know it would be better to do it in C&#8230; I just love regular expressions in Perl, and know that if I wanted anything remotely that powerful in C, I would have to include <\/span><a style=\"text-decoration: line-through;\" href=\"http:\/\/www.pcre.org\/\">PCRE<\/a><span style=\"text-decoration: line-through;\"> which will bloat the project, something I&#8217;m not willing to do. It&#8217;s small and fast right now. I like it that way.<\/span> This has been implemented&#8230; <span style=\"font-style: italic;\">fversion -G pattern<\/span> will now match whatever you put at it. 
100% Perl.<\/li>\n<li>Another piece to problem 2 is the ability to purge individual versions, or version ranges. For example, I have a file that has 212 versions. I want to preserve v1.0 and v212.0. I want to purge 2-211. I can&#8217;t do that right now, so all 212 are there. Yes, I could lock v1.0 and then make a change to it, thus creating v213 which would essentially BE v1, and then purge 211 versions&#8230; But that&#8217;s not elegant, and I prefer to do things elegantly. I&#8217;m working on this now. It&#8217;s all C.<\/li>\n<\/ul>\n<h3>Braindump<\/h3>\n<ul>\n<li>Setting a special xattr that could mark a file as &#8220;don&#8217;t copy&#8221;&#8230; Better yet mark it with a number that is the number of copies you want to keep of this file&#8230; So if it was 1, you&#8217;d always have the current version plus 1. If it was 12&#8230; You get the point.<\/li>\n<li>Along the same line, having a mount-time option that sets the maximum number of versions kept for all files on the volume. This would be useless for my uses, but useful if you only wanted to keep versions in order to restore a misdeleted file or something.<\/li>\n<li>&#8230; same line, again. Maybe have a config file that lets you set certain file types differently: e.g. all .o files (object files) never keep copies. All .pl, .cpp, .c, .java, etc. keep all. Everything else keep 3.<\/li>\n<li>Web interface to allow users who don&#8217;t have shell access an easy way to restore old versions (thinking of user-serviceable backups here)<\/li>\n<li>Still need to implement directory handling\/recursion for the purge.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>[UPDATE: There is a newer version of CopyFS available] Some backstory: I used to write a LOT of code in the C programming language. I used to be actively involved in numerous open-source programs that used C. 
Hell, I even &hellip; <a href=\"http:\/\/www.matthewgkeller.com\/blog\/2006\/02\/12\/copyfs-is-now-even-cooler\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,5,10,17],"tags":[101,102,59,71],"class_list":["post-19","post","type-post","status-publish","format-standard","hentry","category-architecture","category-general-coding","category-linuxy","category-work","tag-c","tag-copyfs","tag-linux","tag-perl"],"_links":{"self":[{"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/posts\/19","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/comments?post=19"}],"version-history":[{"count":0,"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/posts\/19\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/media?parent=19"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/categories?post=19"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.matthewgkeller.com\/blog\/wp-json\/wp\/v2\/tags?post=19"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}