[Most Recent Entries]
Below are the 20 most recent journal entries recorded in
|Wednesday, June 29th, 2005|
|Monday, June 27th, 2005|
|Saturday, June 25th, 2005|
|Friday, June 24th, 2005|
|API for developing plugins released
I've released an open source version of CiteULike's plugin interface (the code which scrapes citation details from external sites) under the BSD license. It's available to browse here: http://svn.citeulike.org/svn/plugins/
and the HOWTO, which provides the documentation as it stands at the moment, is here: http://svn.citeulike.org/svn/plugins/HOWTO.txt
If you want to pull the source code down and start hacking on it, you can get it using subversion (http://subversion.tigris.org/) by typing:
svn co http://svn.citeulike.org/svn/ citeulike
The relevant features are:
- Language neutral. You can write plugins in whatever programming language you like (assuming it can run on my server).
- Sample code. I've tidied up a few of the existing plugins and released them as part of the new system. Ultimately I'll convert them all, but it's now at a state where it makes sense to release it and get everyone hacking on it.
- Proper documentation. It walks you through all the steps required to produce a new plugin.
- Test suite. You can (and should) write test cases for your scraper, and we'll know when the site changes its format and breaks the scraper (such is the nature of writing these things).
- Test harness. You can actually run the tests against your code without having to guess whether they'd work or not (which was the case up until now if you wanted to write a plugin).
- Common utility functionality (author names, RIS, and BibTeX parsing) built into the "driver" part of the code, so you don't need to re-invent the wheel.
It's live on the server now, as is the first user-submitted plugin (from Diwaker Gupta) to scrape the proceedings from the computer science journals on the USENIX site.
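To give a flavour of why test cases matter for scrapers, here is a minimal sketch in Python. It does not follow the real plugin protocol (the HOWTO is the place for that); the names `scrape_title` and `run_test_cases` and the sample pages are made up purely for illustration. The idea is simply that each test pairs a saved page with the details you expect to get out of it, so a change in the site's format shows up as a failure.

```python
import re
from html import unescape


def scrape_title(html):
    """Hypothetical scraper: pull the article title out of a <title> tag."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    # Collapse runs of whitespace, then decode entities such as &amp;
    return unescape(" ".join(match.group(1).split()))


def run_test_cases(scraper, cases):
    """Run each (page, expected) pair and return a list of failures."""
    failures = []
    for page, expected in cases:
        actual = scraper(page)
        if actual != expected:
            failures.append((page, expected, actual))
    return failures


# Made-up sample pages; a real test would use a saved copy of the site's page.
CASES = [
    ("<html><title>Scaling  Web &amp; Cache Servers</title></html>",
     "Scaling Web & Cache Servers"),
    ("<html><body>No title here</body></html>", None),
]

failures = run_test_cases(scrape_title, CASES)
print("all tests passed" if not failures else failures)
```

If the site later moves the title somewhere else, the first case starts failing, which is exactly the early warning the test suite is meant to give.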
|Monday, June 20th, 2005|
|Edit DOIs for your articles
Another new feature lets you add a DOI (Digital Object Identifier) to your article on CiteULike if it doesn't have one and you know what it should be. Just click down to the view article page, and hit the "Edit links and DOIs..." button. You'll obviously need to be logged in, and you'll need to have the article in your collection.
Apparently some journal styles actually require you to produce bibliographies with DOIs, so this should help if you publish in one of those.
|Sunday, June 19th, 2005|
I've added experimental support for MathSciNet. You should be able to use the "Post to CiteULike" bookmarklet to automatically add articles from MathSciNet to your library without all that tedious mucking about copying and pasting details.
By "experimental", I mean that it seems to work, but I'd be interested to hear from you at firstname.lastname@example.org if it's not behaving. To be trendy, I should probably call it Beta or something like that. Google seem to have recently redefined that word to mean just "new feature".
|Find your Personal PDFs more easily
I'm now back from holiday. After a grand total of sixty seven hours sitting in railway carriages in Western Russia, I'm now surprisingly keen to get back to work.
The first minor update follows on from the Personal PDF announcement. When you upload your own copy of the PDF file, you'll be able to see a small icon in the list view to let you tell at a glance which files you've already uploaded and which ones you haven't. Previously, you had to go down to the "view article" page, which was a real pain.
Update: The same now applies to articles with notes. If an article has public notes attached to it (or it's your own article with private notes and you're logged in) then you'll get a handy icon in the list view too. It looks ugly as anything, but it's quite useful.
|Friday, May 27th, 2005|
My thanks to Marica Odagaki for producing a Japanese translation of CiteULike.
I'm always amazed by how much effort goes into translating an entire web site and how people like Marica kindly volunteer to help out. Hopefully this will help all the Japanese users who currently use Google Translate to access CiteULike. I've no idea how well it translates into Japanese, but it has certainly produced some fairly interesting Japanese->English translations when I've used it.
|Monday, April 25th, 2005|
Do you keep copies of the PDFs of some of your articles on your hard drive? Can you never find the right one when you want to? Or, do you keep these files on a machine at work and sometimes want access to them at home?
Now you can keep a personal copy of the PDF on the CiteULike server. When you're logged in, just navigate down to any article with your web browser and you'll get the option to upload the PDF from your hard drive. After that, you'll have access to the content wherever you can log in to CiteULike.
There are obviously some restrictions to comply with copyright law. In order to prove that you have access rights to the content, you must be able to upload the PDF file directly into your account. That's to say, you must prove that you had a copy of the article in the first place - CiteULike will not be able to get hold of it in any other way.
Secondly, rather obviously, you must be logged in to your CiteULike account to download the article again when you, say, want to read it at home. That's the "personal" aspect. Think of this service as (a more organised and convenient) extension of your local hard drive. You can keep your content on it, but you can't use it to share that content with anyone else.
All data is backed up to a remote site and, in the initial testing phase, there are no usage quotas. I'll clearly need to impose some sort of limit at some point, but I've got plenty of disk space and I want to get an idea of popularity so I can work out where I should peg the limit.
Coming next (once I'm happy I've set the usage limits correctly) is the ability to do a full text search on all the PDFs in your collection. This will hopefully solve the "where did I read that?" problem - all you'll need to do is ask CiteULike to search through all the papers you've ever read and tell you exactly where you read it.
|Monday, March 21st, 2005|
I've rewritten a substantial chunk of what goes on behind the scenes on the server. You shouldn't notice any difference except for:
- The "delete article" button has moved from the list page to the article page itself.
- If two people post the same article in rapid succession, you only see the latter as opposed to filling the page up with multiple instances of it.
- Page generation is 10 times faster than before (for some of the more commonly viewed pages).
The chances of this all working entirely smoothly are pretty slim, so please let me know if you notice anything untoward happening on the site.
The geeky explanation of what I've done is as follows:
I replaced a simple shared-memory hashtable with memcached, together with some modifications to handle dependencies in data, as well as a custom Tcl client API.
The advantage of this is that I can pre-cache the HTML rendering of each article in the list (hence I moved the "delete" button so the HTML doesn't depend on whether you're logged in or not), and use memcached's method for fetching multiple objects in one request to simply suck down all the HTML fragments and then simply play them out to the webserver. Not only is this approximately 10 times faster than what I had before (a typical page with all the articles in the cache renders in 12ms), but most of the remaining time is taken up just waiting for the data to appear over the network - and this consumes next to no CPU at all. That means I've managed to get at least an extra factor of ten capacity out of the existing hardware (and there are probably still some more optimisations I can do if I need to), and I can now use multiple instances of memcached to effectively scale up capacity by adding more hardware when required.
So, I'm now in a pretty good position to cope with growth in the number of users (which is exponential), and I can now get back to spending more time adding features (like finally writing that API) than worrying about this stuff. The only thing I still need to do, of course, is fix all the bugs which I've just introduced - please do let me know if you notice anything broken.
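For anyone curious what the multi-get approach looks like in outline, here is a rough sketch in Python. It is not the actual CiteULike code (that uses a custom Tcl client); `FakeCache` is an in-memory stand-in for a memcached client, and `render_article`, `render_list` and the `article_html:` key prefix are hypothetical names chosen for illustration. The point is that all the pre-rendered fragments for a page are fetched in one request, and only the misses are rendered the slow way.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get_multi(self, keys):
        # Like a memcached multi-get: only keys that are present come back.
        return {k: self._store[k] for k in keys if k in self._store}


def render_article(article_id):
    """Hypothetical slow path: build the HTML fragment for one article."""
    return f"<li>article {article_id}</li>"


def render_list(cache, article_ids):
    """Assemble a list page from cached fragments, rendering only the misses."""
    keys = [f"article_html:{a}" for a in article_ids]
    cached = cache.get_multi(keys)  # one round trip for the whole page
    fragments = []
    for article_id, key in zip(article_ids, keys):
        html = cached.get(key)
        if html is None:
            html = render_article(article_id)
            cache.set(key, html)  # the next request finds it in the cache
        fragments.append(html)
    return "".join(fragments)
```

Because the fragments contain nothing user-specific, the same cached HTML can be played out to every visitor, logged in or not.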
|Tuesday, March 1st, 2005|
|Sunday, February 27th, 2005|
I've added an Amazon plugin so you should be able to post books from their web site using the normal bookmarklet. You can't post any of the other things they sell (like microwave ovens, for example) as that would be extremely confusing.
|Tuesday, February 15th, 2005|
|Roundup of two minor new features...
- For the librarians: Install one of Dan Chudnov's magic dynamic appropriate resolver bookmarklets and click it when you're looking at an article on CiteULike. Dan's trying to push the ideas behind this as a standard for embedding this sort of information in web pages, and I'm all in favour of that.
- For the fastidious: You've now got much more control over how you can edit the detail of your articles once they're in your library. Clicking the "Edit details..." button on the article page should let you do things that you previously couldn't do (like changing the article type, and adding abstracts), while clicking the "Edit links..." button will let you associate another URL with the article. This is handy if you eventually find a PDF of it online.
|Saturday, February 12th, 2005|
|New language - Italian
... or, at least it's new to CiteULike, even if the language itself has been about for a while (something to do with the Romans, I believe). Many thanks to Paolo Massa for his hard work translating the site into Italian. There's barely room for the array of flags in the top right-hand corner of the page any more - I'm amazed at the sheer number we've got now. Thanks again to everyone who's contributed to all these translations.
|Friday, February 4th, 2005|
|citeulike-discuss mailing list
It's high time we had an email discussion list for CiteULike. I've set one up, and you can subscribe here: http://www.citeulike.org/mailman/listinfo/citeulike-discuss
Feel free to use it for general discussion about the service, ideas for what you'd like CiteULike to do, and any questions or concerns you might have. If you've got a question about your account, or a specific bug report, that should go directly to me (Richard Cameron - email@example.com) and not to the list.
|Wednesday, February 2nd, 2005|
|Extremely long lists of tags - filter them!
I've added a little filter box to the list of tags on the right hand side of the pages. I find this quite useful to find stuff now my list of tags has grown to a horrendous length.
|Monday, January 31st, 2005|
|New plugin: Science
Science is now fully supported in CiteULike, which means you can add articles from it without having to type in the details.
|Friday, January 28th, 2005|
... and you can also post articles from IEEE Xplore and have the details automatically appear for you.
Quick update: You can now post articles directly from the nature.com website without having to type the details in manually.
|Tuesday, January 18th, 2005|
|Post anything - not just from supported sites
I've obviously been spending too much time working on CiteULike and not enough time using it. It's only in the last few days I've discovered just how frustrating it is to get an error message when CiteULike won't let me post a particular article because it's not from a supported site.
So, I've removed this restriction. You can now post any URL you like. The only downside is that if it's not from a site that CiteULike knows about, you'll have to type in the citation details yourself.
There's a bit of history to why I decided to impose the restriction in the first place. When I wrote the site I didn't want it to become yet another social bookmarking site. I wanted it to fill a specific niche - academic articles. For it to be a useful academic resource, I reasoned, I had to make sure that users could browse about and get academic articles and not, say, details of chalets in France posted by a biochemist trying to organize his skiing holiday. That sort of thing would seriously devalue the site.
Secondly, I wanted to make sure the citation details were reasonably accurate (and spam free), so I initially decided to only take this information from serious online databases (like PubMed and Sciencedirect).
However, it's just far too annoying to be limited to a finite set of providers, so I had to make the change. There's still the issue of not polluting the quality of articles with arbitrary web pages though. To solve this, if you post a web page (or a PDF file, or whatever) from an unrecognized source, it will appear in your library, but not on the front page. Of course, if you post the same article as others you'll still be able to see the "posted by x others" link to find out who. This should hopefully be a reasonable compromise which will limit the damage that spammers can do to the site while allowing users the flexibility to actually store their entire bibliography online, and not just part of it.
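The visibility rule can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the site's real code: the `Post` record, the function names, and the hostnames in `SUPPORTED_HOSTS` are all invented for the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Invented hostnames standing in for "sites CiteULike knows about".
SUPPORTED_HOSTS = {"pubmed.example.org", "sciencedirect.example.org"}


@dataclass(frozen=True)
class Post:
    user: str
    url: str


def is_supported(url):
    """True if the URL comes from one of the recognised sources."""
    return urlparse(url).hostname in SUPPORTED_HOSTS


def front_page(posts):
    """Only posts from recognised sources appear on the front page."""
    return [p for p in posts if is_supported(p.url)]


def library(posts, user):
    """A user's own library shows everything they posted, from any source."""
    return [p for p in posts if p.user == user]


def others_who_posted(posts, url, user):
    """Other users who posted the same URL (the "posted by x others" link)."""
    return sorted({p.user for p in posts if p.url == url and p.user != user})
```

So a chalet listing posted by one user stays in that user's library, never reaches the front page, and anyone else who posts the same URL can still see who else has it.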
Hope this helps.