Wednesday, August 28, 2013

GSoC Weeks 8 & 9

I've spent the last couple of weeks responding to code reviews and cleaning up my commit history. This included a significant consolidation of the #1382 patch, which can be found here. The patch is being reviewed and we have decided that it will be included in Tahoe 1.11.0. While users won't see much of a difference from the patch, file reliability will increase significantly.

We've also decided to include #1057 in 1.11.0, which will apply the servers-of-happiness test to mutable uploads. This will be a great win for end users because they won't be confused by the conflicting behavior of mutable and immutable files.

Along the same lines as #1382 and #1057, I spent some time trying to reproduce #1830, which is an issue with happiness settings. Users have reported instances in which their happiness settings are ignored and the uploader uses the default value of 7 instead. Sadly, I couldn't reproduce the bug with a unit test on trunk, and I didn't find anything that looked suspicious when reading through the codebase. My unit test can be viewed on this branch. The ticket needs more investigation before it is closed. If anyone is having this issue, please post something on trac so we can properly reproduce the bug.

I also wrote a patch for #671, which is to bring back the size limit option for storage nodes. This is something I have wanted ever since I started using tahoe, so it was nice to be able to scratch that itch. The patch relies on the new LeaseDB branch in order to calculate the amount of data stored on the given node, so the patch won't land until 1.12 because of a significant performance regression in LeaseDB. However, I have been testing the branch in production and it has unit tests, so if you are in need of this feature you should be okay using it.
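To make the idea concrete, here is a minimal sketch of how a size limit could be enforced once share sizes live in a database. It is not the actual patch: the `shares` table, its `used_space` column, and the helper names are assumptions made up for illustration, and the real LeaseDB schema may look different.

```python
import sqlite3


def used_space(conn):
    """Total bytes recorded for all shares (hypothetical 'shares' table)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(used_space), 0) FROM shares"
    ).fetchone()
    return row[0]


def can_accept(conn, share_size, size_limit):
    """True if storing share_size more bytes stays within size_limit."""
    return used_space(conn) + share_size <= size_limit


# In-memory database standing in for the storage node's share database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shares ("
    "storage_index TEXT, shnum INTEGER, used_space INTEGER)"
)
conn.executemany(
    "INSERT INTO shares VALUES (?, ?, ?)",
    [("aaaa", 0, 4096), ("aaaa", 1, 4096), ("bbbb", 0, 1024)],
)

print(used_space(conn))               # bytes currently stored
print(can_accept(conn, 1000, 10240))  # small share fits under the limit
print(can_accept(conn, 2000, 10240))  # larger share would exceed it
```

The point of the sketch is that a single aggregate query replaces a walk over every share on disk, which is why the size limit depends on the database branch.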

Finally, I've spent some time improving the buildbot configuration for automated testing. My first improvement was to create a CouchDB instance at IrisCouch so that test results from the buildslaves will be uploaded to a central database. This data will be useful for tracking possible performance regressions over longer periods of time than one or two patches.




Friday, August 9, 2013

GSoC Update - Week 7

For the past couple of weeks I have been writing patches for servers-of-happiness and for issues in the command line interface (CLI).

  • Ticket #1057 (Github) - Alter mutable files to use servers-of-happiness: Tahoe uses the servers-of-happiness measurement for immutable files, but not mutable files. Instead mutable files use the old shares-of-happiness test, which has no concern for share distribution over multiple servers. This behavior is confusing for users because mutable uploads can succeed when it is impossible to upload immutable files. Closing this ticket will improve tahoe's usability as well as increase file redundancy for successful uploads. It is also required before I can implement the upload strategy of happiness for mutable files.
  • Ticket #2034 (Github) - Mutable file upload is sensitive to the number of servers: When developing a patch for #1057, I found a small issue in tahoe's mutable upload process. If the number of servers on the grid is greater than k + N, the query process can end prematurely due to a race condition, which causes tahoe to incorrectly report a file as unhealthy. The race condition is unlikely to appear in production because it requires little to no lag between the server query and the response, but it causes issues when testing servers-of-happiness with mutable files.
  • Ticket #2027 (Github) - Inconsistent "tahoe cp" behavior: In tahoe 1.10, there is a small bug in the cli that prevents users from copying a file off of the grid without specifying a file name. The bug is caused by a unicode assertion, so my patch simply converts the destination to unicode.
  • Ticket #712 (Github) - Tahoe cp -r doesn't copy the parent: When using the cli, "tahoe cp" is supposed to mimic the behavior of "cp". However, when tahoe copies a directory recursively, it does not copy the parent directory like cp does. My patch fixes this small issue by creating a different target directory for the cli.
  • Ticket #1836 (Github) - Use leasedb for share count: Right now tahoe uses a crawler to keep track of share information. This is inefficient, especially for large servers, so there is a branch in development to keep track of share information in a SQLite database. My patch removes the crawler and makes the necessary SQL queries instead. Currently the branch cannot be merged with trunk because of a significant performance regression, but the core dev team has slowly been finding the bugs. I'm interested in this branch because I would really like tahoe to have a maximum size limit again.
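
To illustrate the difference between the two tests mentioned under #1057, here is a rough sketch of a servers-of-happiness calculation. It treats happiness as the size of a maximum matching between servers and shares, which is my understanding of the definition. The function name and the input format (a dict of server id to the set of share numbers it holds) are made up for this example and are not tahoe's actual code.

```python
def servers_of_happiness(placements):
    """Size of a maximum matching between servers and shares.

    placements maps a server id to the set of share numbers it holds
    (a hypothetical input format chosen for this sketch).
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        # Augmenting-path search: claim a free share, or take one from
        # a server that can be moved to a different share.
        for share in placements[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placements if try_assign(server, set()))


# Ten shares on a single server: a shares-of-happiness count would see
# 10 shares, but only one distinct server can be matched to a share.
print(servers_of_happiness({"s1": set(range(10))}))

# Three servers, but only two distinct shares between them.
print(servers_of_happiness({"A": {0, 1}, "B": {0}, "C": {1}}))
```

The first case shows why the old test is confusing: counting shares alone says the upload is well distributed even though losing one server loses the file.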
I was also able to finish the new IRC bot for #tahoe-lafs on freenode during my free time and it has been working out great. The new bot will parse multiple ticket numbers and it will post updates from trac. You can find the source code here, and I've spun the IRC code out into another package that you can find here.