CouchDB Weekly News, April 24
Weekly CouchDB meeting – summary
- 1.6.0 release: voting for 1.6 rc3 is open; Erlang R15 and R16 are fine, but several unexpected issues were reported when testing with Erlang R14BX; any help with testing and diagnosing these issues is welcome
- rcouch merge status: the recent call for testing led to progress on COUCHDB-1994. The Windows build and some other small bugs were fixed. Still, help with testing is very welcome!
- BigCouch merge status: significant progress was made last week; the next steps will be code review and testing
- BigCouch branch and the new multirepo design: how to edit individual repos, split the big CouchDB.git into single repos, and finally run the merge
Major Discussions
Error when installing CouchDB on Windows 7 (see thread)
Installing the CouchDB 1.5.1_R16B02 binary on Windows 7 led to an error caused by a problem with the Windows R16B02 installation file. The file was replaced quickly, and everything works now. The installation file can be downloaded here.
Compaction of a database with the same number of documents is getting slower over time (see thread)
The database in question contains a relatively stable number of documents (in CouchDB: "doc_count"), but the documents themselves change frequently, including many insertions and deletions (CouchDB: "doc_del_count"). The person asking expected the compaction time to stay the same, but the numbers showed that compaction took longer over time. They asked why this was the case and how it could be overcome.
The reply is that compaction time in CouchDB scales with the sum of "doc_count" and "doc_del_count". As explained by Adam Kocoloski, this can currently be controlled by "1) [Purging] the deleted docs (lots of caveats about replication, potential for view index resets, etc.); 2) [Rotating] over to a new database, either by querying both or by replicating non-deleted docs to the new one. Neither one is particularly palatable. CouchDB currently keeps the tombstones around forever so that replication can always work. Making changes on that front is a pretty subtle thing but maybe not completely impossible. Also, there's a new compactor in the works that is faster and generates smaller files."
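As a rough illustration of the scaling described above, the following Python sketch estimates relative compaction work from the "doc_count" and "doc_del_count" fields of a database info document. The helper name and the sample numbers are made up for this example; it is a back-of-the-envelope model, not a measurement.

```python
def compaction_work(db_info):
    """Entries compaction has to walk: live documents plus tombstones."""
    return db_info["doc_count"] + db_info["doc_del_count"]


# Hypothetical snapshots of the same database at two points in time.
# The number of live documents is stable, but tombstones keep piling up.
early = {"doc_count": 100000, "doc_del_count": 50000}
later = {"doc_count": 100000, "doc_del_count": 900000}

ratio = compaction_work(later) / compaction_work(early)
print("relative compaction work: %.1fx" % ratio)
```

Under this simple model, compaction gets slower even though "doc_count" does not change, which matches what the inquirer observed.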
Data replication: replicating only non-deleted documents (same discussion as above; see thread)
The approach could be to either run a filtered replication or block deleted documents with a validate_doc_update function on the target database (see example function here).
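A minimal sketch of both options follows. Python is used only to build the JSON documents involved; the filter and validation functions themselves are JavaScript, stored as strings in design documents. The database names, the design document names and the filter name "no_deleted" are placeholders chosen for this example, not taken from the thread.

```python
import json

# Option 1: a filter function in a design document on the source database.
# It only lets non-deleted documents through.
filter_ddoc = {
    "_id": "_design/replication",
    "filters": {
        "no_deleted": "function(doc, req) { return !doc._deleted; }"
    },
}

# Request body for POST /_replicate; the filter is named "<ddoc>/<filter>".
replicate_body = {
    "source": "olddb",   # placeholder database name
    "target": "newdb",   # placeholder database name
    "filter": "replication/no_deleted",
}

# Option 2: a validate_doc_update function in a design document on the
# target database that rejects incoming tombstones.
validate_ddoc = {
    "_id": "_design/reject_deleted",
    "validate_doc_update": (
        "function(newDoc, oldDoc, userCtx) {"
        " if (newDoc._deleted) {"
        " throw({forbidden: 'deleted documents are not accepted'});"
        " } }"
    ),
}

print(json.dumps(replicate_body, indent=2))
```

Note that the second option would also reject deletions made directly on the target, so it probably only suits a target that is filled purely by replication.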
How to handle terabyte databases (see thread)
This was a request about improving performance, insert time, compaction and replication for terabyte databases with billions of documents (setup: CouchDB 1.2 and CouchDB 1.4). Some approaches to handle a write-heavy workload:
- Replication: Supply parameters to allocate more resources to a given job (code example here).
- Insert time: Ensure that e.g. compaction only runs in the background and does not impact the throughput of interactive operations; this is a work in progress at the moment.
- Compaction: A new, significantly faster compactor is in the works; it also generates smaller post-compaction files and eliminates the exponential falloff in throughput observed by the inquirer. BigCouch can also help with this problem, since it partitions databases.
- Make sure not to exceed the write capacity of the RAID (this effect is amplified for partitioned databases).
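For the replication point above, here is a sketch of what a tuned replication document might look like, again built in Python. "worker_processes", "worker_batch_size" and "http_connections" are replicator settings that can be given per replication; the values, the document ID and the database names are placeholders, not the ones from the example linked in the thread, and suitable values depend on the hardware.

```python
import json

# A replication document as it could be saved to the _replicator database.
# All values below are illustrative assumptions, not recommendations.
replication_doc = {
    "_id": "big-db-replication",
    "source": "http://source.example.com:5984/bigdb",
    "target": "bigdb_copy",
    "continuous": True,
    "worker_processes": 8,      # parallel workers for this job
    "worker_batch_size": 1000,  # documents handled per worker batch
    "http_connections": 40,     # HTTP connections available to this job
}

print(json.dumps(replication_doc, indent=2, sort_keys=True))
```

Raising these numbers gives one job more resources at the expense of everything else running on the node, so it is worth changing them step by step.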
Vote: Release Apache CouchDB 1.6.0 rc3 (ongoing testing and discussion; see thread)
We encourage the whole community to download and test these release artefacts so that any critical issues can be resolved before the release is made. Everyone is free to vote on this release, so get stuck in on our dev@couchdb.apache.org mailing list! Find all release artefacts we are voting on in this list. If you want to test, please follow this test procedure. The changes since last vote round can be found here.
Releases in the CouchDB Universe
- Munin Plugin for CouchDB 0.6 – including monitoring, more verbose autoconf, graph fixes, better README and more
- couchbeam 1.0.5 – a maintenance release for the CouchDB client library for Erlang applications
- couchable 0.4.0 – allows arbitrary python objects to be stored in CouchDB, while keeping the resulting CouchDB document as "natural" as possible
- cloudant 0.5.9 – asynchronous Cloudant / CouchDB Interface
- CouchDB Queue Service (cqs) 0.10.0 – CouchDB Queue Service: an Amazon SQS implementation on CouchDB
- hackney version 0.12.0 – service release including fixes and improvements of the HTTP client for Erlang applications
- hackney_lib 0.3.0 – including new features for playing with HTTP and Web protocols
- stork-odm 0.1.7 – providing a layer of document management over CouchDB
- couch-db 1.0.0 – CouchDB client for node, with specific APIs
- changes-stream 1.0.1 – Simple module that handles getting changes from CouchDB
- couch-login 1.0.0 – a module for doing logged-in requests to a CouchDB server
- lockit-couchdb-adapter 0.4.1 – CouchDB adapter for Lockit
- no software release, but still: the Apache OpenOffice project will also start its own Weekly News
Opinions
- CouchDB and Fauxton – a presentation
- CouchDB monitoring: you're doing it wro...you can do it better!
- Understanding race-induced conflicts in BigCouch
- Testing Lucene's index durability after crash or power loss
- Best Practices for CouchDB developers on Microsoft Azure (slides)
- The Mostly Erlang-Podcast, #034: Community
Use Cases, Questions and Answers
- Stack Overflow: Querying change sets in CouchDB
- Stack Overflow: CouchDB not Working on Android Emulator
- Stack Overflow: Complex objects and Geo Locations with CouchDB and GeoCouch (nodejs, nano)
- Stack Overflow: how to set group=True in couchdb-python
- Stack Overflow: Updating a Couch Document
- Stack Overflow: CouchDB having trouble querying by a specific key or value
- Stack Overflow: Symfony2 FOS UserBundle login doesn't work (CouchDB)
- Stack Overflow: XSL Transformations in CouchDB
- Personal Blog, question: couchdb having trouble querying by a specific key or value (no public answer yet)
- Personal Blog, how-to: Add a attachment to CouchDB with play framework JavaAPI
- Corporate Blog, how-to: Export & Import a Database with CouchDB
- Google Groups, question: CouchDB driver & No Tables/FIELDS?
Get involved!
If you want to get into working on CouchDB:
- rcouch Merge: Erlang hackers and CouchDB users, we need your help with testing and review of the rcouch merge. It's easy! Find the how-to in this post.
- Here's a list of beginner tickets around our currently ongoing Fauxton implementation. If you have any questions or need help, don't hesitate to contact us in the couchdb-dev IRC room (#couchdb-dev) – Garren (garren) and Sue (deathbear) are happy to help.
- Want to join us in updating CouchDB-Python for Python 3? Take a look at issue 231.
We'd be happy to have you!
Events
- April 26, Delhi, India: CouchDB meetup - Understanding how design documents work
- June 16, 17, San Francisco, CA: CloudantCON
Job opportunities for people with CouchDB skills
- Python / PostgreSQL / CouchDB Developer, Levallois-Perret, France
- Technical Engineer, Beaverton, OR
- For Freelancers (bidding platform): an app-building project, no location specified
- For Freelancers (bidding platform): CouchDB or Equivalent with Excel Template, no location specified
… and also in the news
- A Journey to the End of the World (of Minecraft)
- Open Source feat. Business: Open Source Enterprise (slides)
- Previously Unknown Warhol Works Discovered on Floppy Disks from 1985
Posted on behalf of Lena Reinhard.