-
Website
http://bret.appspot.com/ -
Original page
http://bret.appspot.com/entry/how-friendfeed-uses-mysql -
Subscribe
All Comments -
Community
-
Top Commenters
-
eas
3 comments · 4 points
-
mndoci
5 comments · 2 points
-
Jud Valeski
3 comments · 1 points
-
sospartan
2 comments · 1 points
-
Aviv Shaham
2 comments · 3 points
-
-
Popular Threads
-
OAuth WRAP support in FriendFeed for feedback - Bret Taylor's blog
23 hours ago · 4 comments
-
OAuth WRAP support in FriendFeed for feedback - Bret Taylor's blog
To me this looks like part of the tendency to move away from RDBMS for some or most of the storage needs to be able to scale well. The next natural step would be a shared nothing architecture using specialized storage engines for different needs. For instance there are different key/value stores with varying characteristics and the document-based ones should get more stable over time. One option I think will get more and more interesting in the future is using a graph database engine like http://neo4j.org/ (which BTW is the reason I joined that project) - it handles interconnected and semistructured data in a very efficient way.
Looks like a great alternative for schema-less storage, and seems to be in production at mixi.co.jp [2]
Do you evaluated this one? or have any opinion?
Excelent read! Thx!
[1] http://tokyocabinet.sourceforge.net/index.html
[2] http://www.igvita.com/2009/02/13/tokyo-cabinet-...
We're working with the folks at Mixi and Tokyo cabinet is intersting. bobd@virident.com if you'd like to connect. We're deliveirng unparralled MySQL performance there with our Virident appliance. Cheers,
Additionally, now you're locked into your development language. Note - SQL tends to outlast development languages. So long after you may decide that you want to replace your app with a new language or connect another app to your database that's written in another language - you'll still have SQL around. Odds are. But with this solution you won't be able to do that.
So, why make this unnecessary sacrifice? To speed up database maintenance? Ok, here's the thing - most other databases do not have the same maintenance limitations that MySQL has that are the root cause of all of this. So, address the root cause and you no longer need to further limit your options with this solution. For example, most database products I work with allow you to add columns, add/remove indexes, etc - *ON THE FLY*. In other words the product won't rewrite the table and essentially hold your application offline & hostage for hours. Furthermore, some products will even let you completely rearrange your storage (add/remove new disks, raw logical volumes, file systems, etc) on the fly - automatically rebalancing your data for you. While your app is fully online and your users are blissfully unaware.
Nice design, but I suggest that by addressing the root rather than the symptom you'll end up with a better solution at the end of the day.
Get a real RDBMS next time.
Also, while this design dramatically reduces the need for DDL changes - it doesn't not completely eliminate them. Inevitably, you'll eventually need to add a column, change the storage around a bit, etc. And when that happens you'll still be down for hours. And wait until you've got 200 million rows - and are down for 24 hours. Ugh.
I don't disagree with any of your observations, either. MySQL is a fine database but not the ideal tool for every job. One of the nice things about MySQL, though, is that with its lack of advanced features (like being able to modify storage and tables on the fly, optimistic locking, etc) comes a certain simplicity that makes it a very malleable tool for someone who needs a fast and reliable storage engine.
One thing that I have found, though, is that many sacred cows need to be slaughtered in the name of performance, stability and scale.
And OLAP was alive and well side-by-side with OLTP systems, and I also disagree iwth your buckets of reporting systems. Crystal / Business Obj have their uses, and aren't on some technical evolution scale. The big reporting systems are good for just that- big sets of data.
What is the point is that there is a considerable amount of functionality dependent upon a data model that is lost with this kind of approach. And if the objective is to address database manageability there are far better approaches that won't cost the loss of this functionality.
Not to say that this isn't creative and fun. Just seems to be a solution for some other problem.
I agree that there are more robust, bug-free apps out there that mirror what he's doing, but they're not open source (to my knowledge), and they're willing to rebuild them, so it's all good.
Support for OLAP is only one of many reasons to avoid abandoning the relational or dimensional model. Even if an organization were to decide that their core business activity did not warrant the kind of analysis made possible by OLAP then there is still a basket full of reasons not to go this route, including but not limited to:
1. the fact that this design does not completely eliminate the need for table alterations and outages will still be required
2. no reporting products of any kind are supported
3. no tools that leverage database metadata are supported
4. any reports, queries or views not anticipated by the application will have to be built leveraging the application layer rather than SQL. A vast number of potential requests or requirements will face additional costs or may not be feasible to implement with this kind of design.
5. although I'm a huge fan of Python I have to admit that languages come & go and this team may find it desirable to replace or augment this application with another language. That won't talk easily to their python APIs. Sure, web services, etc are options - but are also far more costly than native sql in terms of performance.
6. no database constraints are possible. I realize that some people don't use them anyway - but the simple declarative constraint language common to all database products more reliably protects against data corruption that application validation - which is more subject to coding defects and changes over time.
So, take away the argument that OLAP capability is lost and you are still left with 6 huge reasons not to go this route. And that is for a product which is not a 100% solution and is only required because the database is being hosted by a product which fails to support common dbms functionality.
One big aspect that makes this make sense for FF (and not other companies, software, or apps) is that FF is really, really simple schematically. They simply have two basic entities, the user, and the post.
What's the motivation for them to do something that works with other systems? There's no *reason* to be generic. Except, perhaps, that he can never fire his devlopers!
Only if the engine sucks.
Only if they want to extend their datastore. When they will want to add maybe some kind of reporting or aggregation of data. Most likely they will manage this by adding tables to their schema not by modifying existing ones. Business changes don't require any changes to db schema. It was one of the major pains they wanted to alleviate and they did.
>> 5. although I'm a huge fan of Python I have to admit that languages come & go and this team may find it desirable to replace or augment this application with another language. That won't talk easily to their python APIs. Sure, web services, etc are options - but are also far more costly than native sql in terms of performance.
Their db schema is so simple that you could port their data store interface to any language effortlessly (maybe they will just have to convert python pickled objects in db to something more generally readable like JSON).
So a "schema change" to add fields takes no time at all.
But yes, a schema change that transforms existing data either takes a long time, or is done on-the fly as records of a prior version are loaded and then later persisted.
In addition, DAOC (Dark Ages of Camelot - MMORPG for you non gamers), started out using MySQL... And they still use it! They find that this little "toy" handles huge amounts of traffic (think about it).
So what RDBMS do you consider better? Oracle, which requires a team of people frantically digesting a novel sized book to apply a simple patch? One that still lives in the stone age of Java and is the most expensive of all? (I think not) Or how about the M$ version, which in order to run it, you must be running at least one seat of M$ Server? (I think not) All others aren't even worth mentioning here, unless you are speaking of PostgreSQL (which is ok).
Kudos Bret for finding a workaround to your issue. Have you considered the JCR? With JackRabbit, I'm not sure about connectivity from Python, but it certainly works with Ruby. Sounds a little like you all have implemented a version of the JCR w/o a tree structure and with a MySQL backend.
Don't listen to these negative yay-hoos... the bottom line is that your system is running faster and cheaper.
The main problem is the use of MySQL in the first place. If schema changes require hours of downtime, then something else is seriously wrong there. I've used a lot of database products, and changed a lot of schemas on the fly, and I've never experienced any downtime. I've done ALTER TABLE on live production systems that were actually currently in use (as part of live upgrades) on tables with multiple billion rows, and I've never considered for a second that it might require taking the system down. That's just not done in modern RDBMS systems, nor should it ever really be necessary.
So actually 'just switching to postgress' could be overkill in their case.
Yes! The MySQL is the single worst performing DB I have ever worked with, especially considering index changes. Finding the workarounds for this can not be considered a breakthrough -- it would be much better if it was just fixed.
I do not yet see a value in having a schema-less DB2 or Sql Server (but it might be I just had no real use case). But with 20 million index changes are a breeze, especially compared to MySQL.
i liked the design myself, and i dont consider the language lock an issue - python in this case. actually, it's better to maintain your app in a single language, as well as maintaining a team that speaks and code that same language. no matter if it's python, php, etc. sql cannot be that language, because it's not its goal.
i don't really believe that it's possible to have that much of optimization and stability using other "products" without spending "a lot" of money for licenses and support. furthermore, now they seem to have total control of their system.
entity = user_id_index.get_all(datastore, user_id=binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"))
should probably be:
entities = user_id_index.get_all(datastore, user_id=binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"))
Since user_id_index.get_all appears to be returning a list of entities rather than a single entity.
Regards,
Charlie
The main problem with modulo based sharding is adding more shards and then moving data around accordingly. What do you recommend here for adding nodes to the datastore and keeping the service available - especially if you're trying to keep the setup cheap with minimal nodes? (obviously every setup has trade-offs, but i guess tricks where you swap slaves as temporary masters, etc..)
Basically you'd be adding complexity, and switching from a known platform with a wide range of tools, operational experience, etc. to something which is narrowly used and not particularly reliable anywhere. It'd be better to consider something like CouchDB, the persistent memcache services, etc. which at least have the advantage of being considerably more modern and designed to do exactly this - something like Tokyo Cabinet is going to be a lot easier to work with than a hairy codebase like any of the extant LDAP servers.
Does anyone know if any large sites are using LDAP in their architecture?
at the moment it will work and it should work for some hundreds or even thousands of users more... but if we ever run into scaling-issues (i dont think so) then this will probably be a topic again.
"Since our databases are all heavily sharded,
the relational features of MySQL like JOIN have
never been useful to us"
Doesn't this solution depend upon JOIN to find anything? (Maybe to be fair, "... based on more than a single property"?) Does the sharding preclude ALL multi-property operations?
geir
(F.D. - I'm from 10gen and intensely fascinated by document-ish persistence problems... we have a solution called MongoDB...)
...which is the point of sharding: You have more than one server to cater for the high performance need. (Technically speaking, you can use the Federated engine to do JOINs even across 2 separate MySQL engines, but then you would completely defeat the point of getting better performance, and probably it wouldn't even be applicable for most real-world sharded schemas anyway.)
There is an open-source 'fork' of Federated which fix a couple of the major issues called FederatedX and it is also transaction aware. In addition to that, it may be fairly trivial to patch MySQL to enable combining Federated with Partition which would make the task of sharding almost transparent.
For example, i can get them from table `entities` by:
Select `body` From `entities` Order By `added_id` Desc Limit 10
or
Select `body` From `entities` Order By `updated` Desc Limit 10
But how to use `index_user_id` to do the same thing?
Select `entity_id` From `index_user_id` Where `user_id`='asdfasdfasdf' Order By ?????? Desc Limit 10
I'm bad in English....sorry about that.....
wonder how it plays in clustered environments. We get the serialization
aspect of it via TerraCotta snapshots, but they're not transactional in the
journal rollback sense of the word.
Thanks for the suggestion.
Have you spoken to Ari Zilka re TerraCotta? It might be possible to build something thin that sits on top of TC if you are already using it.
good.
I borrowed the code from: http://www.die.net/musings/page_load_time/
Cool post.
P@
In corporate environment that is not possible. It must be possible to access the DB using standard query operations, tables must be normalized etc. Also normally there aren't that many db admins around so you can't use hacks and non standard ways because the db must be easy to maintain. So when you are deadling with db with tens or hundreds of millions of rows ms sql server and oracle are the way to go (there are few others but those are used by most).
I have seen some attempts to use mysql/postgre in large scale systems and they all have failed. Sure they can get it up and running but there are too many problems. You can get around runtime problems by writing scripts and making hacks into the db itself. That would be actually pretty easy but once you have actually maintain the db (constant changes because of business requirements) it becomes nightmare.
Eventually those systems (running mostly on linux and unix) decided to use oracle (some even ms sql server even though they didn't have any windows related software). Mysql is still in use in test/dev environments but in production it is always oracle or ms sql server. Some unix based solutions actually use only ms sql server because the dev tools and admin tools are so good. The performance is not an issue. In some cases oracle is faster and in some cases ms sql server is faster. The difference is not huge. I personally think that ms sql server (especially 2008 and 2005) are best products from MS.
I think the issue is how the system and used and if you have someone that actually understands databases... Maybe that is why "normally there aren't that many db admins around so you can't use hacks and non standard ways" in the corp environments to ensure that the databases are used correctly....
You know, right, that many or all of the page-views in 3 of the top 10 most visited sites in the US involve PHP, and that most of those pageviews also depend on mySQL in some way? You could add a fourth, Google, the #1 site, YouTube, one of their biggest non-search properties, uses mySQL as its database, and so does AdWords which means that most of their revenue depends on mySQL.
I've seen lots of people in discussions like this assert that everything would be so much better with a real/commercial RDBMS, but I pretty much never see anyone give real examples of real/commercial RDBMS's that have scaled to the levels of concurrent reads and writes that big websites need.
So when there is say a hypothetical product that cost $25k that nets millions in ROI and can be rolled out in a month or two, that's what would tend to be used.
That is not to say there is no role or use for MySQL and friends. Free goes a very long way when you are bootstrapped. It all depends on the use case and organization.
it's annoying the noise generated by people who've seem to have not worked with anything of this scope.
kudos to you Bret! I appreciate your spirit of openness and sharing.
See: http://highscalability.com/scaling-digg-and-oth...
I think 64,000 reads/sec is worth a look, and if you are already using memcached you can use this out of the box.
Interesting article. You should check out the SF mysql meetup this Monday (3/2/09) -- Brian Aker from Drizzle is speaking. Many call Drizzle MySQL Lite or Fat Free MySQL. It's a style of implementing MySQL for FAST usages. The reason folks use MySQL is not just cause it's opensource (hey there's postgres & it's not used in the same volume as MySQL). Oracle, while being quite $$ is also too slow. All those triggers, stored procs, views, & FKs are expensive.
What concerns me, as a MySQL DBA is your datatypes. You have a PK that starts with a varchar (725) which is over 244 which is not good. Yet you say you use auto-incrememnt to get the smallest PK you can (in some tables). I find too many developers use bad datatypes and datatypes are your building blocks.
I think your original schema may not have been that great if yr using large varchars as PKs.
Also -- you might wanna check into using FNVs to see if they may be faster than binaries. Just google FNV.
Interesting article & perhaps you should come to the SF meetup & present??
erin
Why is a 'PK varchar over 244' not good?
Thanks
Jorge
Furthermore with InnoDB, a long PK is a liability where secondary keys are used, as in such keys the PK value is used to reference the row.
If so, how are you dealing with those situations when you need data from multiple users on different shards, but that data needs to be sorted & paginated?
If not sharding indexes, how are you dealing with indexes getting too large? Replication? Or has there not been a situation where this has yet to be a problem?
Thanks for a great post!
Thanks again for sharing--we should all be doing more of this! js
I was wondering how to get over 1MB max value size for mecachedb user_index key. Splitting?
very enlightening post which confirms my assumption that more and more people are moving away from RDBMSs with increasing data to store and manage.
Just curious: Have you every considered using a content repository such as JCR?
Cheers!
Max
Great little article and smart idea.
Just curious as to how you folks deal with post-creation of new indicies on existing data.
(attempt to create an index on over a million entities)
Would definitely assume that the process is parallel. Is hadoop being put to use for this task? :)
What are you thoughts on read-through/write-through cache for a scaling strategy.
Have you checked these guys out.
http://tokyocabinet.sourceforge.net/spex-en.html
http://tokyocabinet.sourceforge.net/tyrantdoc/
Seeded by the engineers from mixi (japanese facebook)
Spent a few hours playing around and it is definitely interesting.
It looks to me that the "cleaner" is a very smart piece in your solution. As far as you explained it, it is an external process that populates/updates the indexes. And that means it has to read sequentially through your main tables to unpack the BLOBs extracting the field values to be placed in the index_tables and I'm wondering how have you implemented this without impacting the performance (in this case sequential access being almost the same issue as adding new indexes). I assume the sharded table is small enough (probably somewhere around 1mil records) in which case adding the index would probably perform decently enough. Or is there some secret sauce in the cleaner that I'm missing?
I really hope you'll find the time to give a bit more details about it. Thanks in advance.
Don't get me wrong: it's an interesting approach and I enjoyed reading the article (and especially the comments :) ). Just please do a follow up to this in a couple of months and tell us you experiences then.
What's the size of your typical shard (e.g. number of servers per shard)?
I'm also wondering what you're doing to offload simultaneous writes (while maintaining consistency) other than sharding. Are you using a MySQL Cluster?
I think that distributes key-value storages must be very interesting for you as some people already said. Especially ones that combine persistent storage with distributed in-memory cache.
Thanks for this super interesting write-up. Few questions came to my mind; some of them admittedly low level and boring ;-)
o Your technique of re-reading the entities after the index lookup gets rid of false positives, but how about false negatives ? What percent of the rows in the index table would you say is out of sync on average ?
o How do you assign the "id" key in the "entities" table ? Why not just use "added_id" (after making it BIGINT UNSIGNED) ?
o Why limit "body" to MEDIUMBLOB ? That essentially means that no object can be bigger than 16 megs compressed.
o Do you really run two instances of MySQL on the same host ? Your sharding example makes it look so.
/Buzz
2. We have multiple databases, so the auto_increment can't work as a global ID. Furthermore, for practical reasons, we want to know the ID before we insert into the database. We use standard UUIDs for this purpose.
3. The choice of MEDIUMBLOB was simply based on our current needs.
4. We run many more than two instances of MySQL on a variety of different hosts. The example above is extremely simplified for the purposes of this blog post. We have run multiple instances on separate disks on the same machine. As long as they have enough memory so you are not going to disk for reads, this method works to a point. In particular, it helps if you want to, e.g., make more shards than you actually need right now so you can accommodate future growth. Put a couple of shards on each machine at first, then expand to 2X the number of machines when growth warrants it.
Insert Into `entities`(`id`) Values(UNHEX(REPLACE(UUID(), '-', '')));
Also, any thoughts on extending the sharding approach to avoid the need to rebuild the indexes when you add (or remove) a shard? I did a lot of thinking in the Twitter context about how to allow for seamless allocation and de-allocation of the index shards, mostly centered around providing user (i.e., shard_on) + shard lookup tables, which is really just one level of indirection, but on the assumption that user data grows much faster than users and will need to be re-sharded much more frequently.
Thanks,
Paul Kinlan
http://fourkitchens.com/tags/materialized-views
We're running it on Drupal.org right now.
Great work!
i use mysql and now its become really slow. but im going to try this soon.
I like the idea to put all (unstructured) data into a BLOB and to use an own table for indexing. And espcially I like that you use MySQL not in the "normal" way but to create an own system on top of it.
To make it OS would be great for all websites that want to set up their own system instead of being forced to use e.g. the Google App Engine stack.
Have you looked into an XML Database, like exist, http://exist.sourceforge.net/
We've been slowly moving this way for the last few years ourselves, storing JSON and/or serialized PHP instead of columns. And likewise for the same reasons you chose - MySQL just works and we understand it.
We haven't yet begun transitioning our biggest tables, though - but now that's food for thought. We should compare notes. :)
http://books.google.com/books?id=9c-pkLaNmqoC&p...
http://code.google.com/p/schemafree/
Bret, one question, you say you store the data compressed using zlib, do you do the compression on the client before sending it to mysql, or do you leave that task to mysql using the gzcompress/decompress mysql functions?
CREATE TABLE index_stream_id_sort_time (
stream_id BINARY(16) NOT NULL,
sort_time INT NOT NULL,
entry_id BINARY(16) NOT NULL,
PRIMARY KEY (stream_id, sort_time, entry_id),
UNIQUE KEY (entry_id, stream_id)
) ENGINE=InnoDB;
We don't use any SQL joins in our current system.
Thanks for sharing this. I have a question and I hope that you have a time to answer me.
Is the sort_time value here reversed (f.e future date minus actual time) so the PK index is ordered exactly as the user stream visually, the newest activity on the top?
Thanks in advance for your reply
Nifty stuff.
Talking of latency, the load time for this page is pretty high. Probably want to make the disqus stuff expandable on demand.
Amit
The topic discussed here is mostly common in large scale and fast changing web 2.0 companies,
but there is not that cool open-source recipe yet and your way looks promising way!
and I concerned about this issue using XML support of postgreSQL[1], did you look on this so far?
[1] http://developer.postgresql.org/index.php/XML_S...
In a previous project that I worked on, I found that marshaling was faster in both directions (marshaling to string, unmarshaling to dict) and also takes less space than using pickle.
Ex. For one of the example entities above, the zlb-compressed pickled dict is about 20% larger than the marshaled version.
In [21]: len(zlib.compress(marshal.dumps(new_entity)))
Out[21]: 228
In [22]: len(zlib.compress(pickle.dumps(new_entity)))
Out[22]: 273
Do you store anything fancy in your entities that can't be marshaled?
"Python has a more primitive serialization module called marshal, but in general pickle should always be the preferred way to serialize Python objects. marshal exists primarily to support Python’s .pyc files."
That's not to say it's always better, but unless it's a bottlneck of some type you're better off going with the community standard.
How did they deal with (never before seen) scaling? By not using database servers (well, of course we had them - some big ones - but read on).
I walked into the VP of Engineering's office (a former principal at Oracle), and told him I needed to provision a production SQL database and whats-our-procedure-for-that. Without even looking up, he replied 'you don't need one'.
That was the day that I learned about the power of filesystem databases. Ever since that day, whenever I approach a data model, I think 'how can I NOT use an SQL server for this?' - and go on from there.
Over the past 10 years, that fundamental change in mindset has saved me more money, time, maintenance, and frustration than I can possibly ever quantify, and allowed me to survive traffic events that would have required 5x the hardware in a traditional SQL setup (and still fallen down).
It amazes me that there are no popular books specifically addressing the subject, yet a bazillion SQL books.
It should be remembered that end of 90's/early 00's was the period that saw MySQL gain ground in this very niche, and retain hold of it to the point that it now powers all major web properties (including, according to this article, FriendFeed).
Great stuff.
Regards,
Peter
Curiously the prime motivator for doing it here, though, is primarily for performance/scale issues which is counterintuitive given the expense of reading and writing variable-record length data. But what's truly bizarre is to see it implemented ON TOP of a relational DB! Why not a flat file system iike Berkely DB etc. MySQL is fast, but every I/O wastes time in the query optimizer/processor that doesn't exist in a flat file system. Ted Codd is rolling in his grave or ROFLing... not sure which.
Essentially you've got a gigantic hash table. You might as well use a clustering file system and map/reduce at this point. Gee, is there anything out there that could solve that problem? http://hadoop.apache.org/core/
Well, I'm sure that isn't "battle tested" enough for your engineers.
What kind of SQL syntax is this behind the scenes? What if you fetched a 1000 entity IDs, does that then produce a really large WHERE clause that looks like:
SELECT body FROM entities
WHERE id IN ('fb46c4ab-9537-4bd9-a7f1-5d7655ab1483',
'87dd9c65-2b3c-4956-8b1e-6754f647ac41',
'99bf8a44-6acd-4119-8bc3-5208652f2b8f',
'e85703ea-b074-4f0f-9eee-e410523b53c1', and so on... )
...with 1000 items in the list?
-- Josef
You have money!
Why don't you go and buy vertica? Check it out.
I'll try and run this by database folks far smarter than myself and see if we can use it.
Thanks again Bret.
it's been about 5 months you posted this article..
i'm interested to know how it is performing for you since and what kind of tweaks, and problems (if any) you've faced since then..
thanks in advance for taking the time to reply because this was a really interesting read.
kind regards,
san
If you got tableA then create tableANew with the new columns and the new index will get populated.
Copy tableA in to tableB...lock tableA..copy the rest of the data in...rename tableANew to tableA. Down for a few seconds without having to give up all the years and years and years of data integrity that are built into RDMS systems.
UUID as index of entity table is a waste. Entity table may want to use more meaningful keys to as index. Clustering and space efficiency are a few benefits to do it. For example, feed entity may have a clustered/primary key of $publisher_id+$publish_time. Having a cluster key enables database store all feeds belong to same publisher as part of the index, making certain queries much more efficient. On top of that it saves at least on index table per entity types.
Blog post: http://ray2cents.blogspot.com/2009/11/cheating-...
Given everything else the same, the simpler the better. But what about compound keys? There're situations where compound keys are necessary e.g. feeds of all fans published last month. They can either break the query into queries, each using a single index, and then merge them in code; or it can implement composite keys by adding multiple attributes to index tables. None of the two solutions seems satisfactory.
http://jamesgolick.com/2009/12/16/introducing-f...
www.be-anonymous.us.tc
Wouldn't it be easier just to save the data to the file system?
File systems are very mature, and have great replication, snapshoting and backup facilities, they are also very flexible.
- sorting items for display,
- auto incrementing ids,
- providing a way to create, retrieve, update and delete "files" over the network
The first two problems seem trivial, but the last is hard to solve with great scale and lots of concurrent users. MySQL works, so why not use it.
Still, I wonder if there really is no other way to do this, other than MySQL.
I've noticed that when people use the word "just" they're usually about to be condescending.
Also as mentioned below, you don't get indexing if you just use the filesystem.
The biggest problem is that this isn't strictly a relational / non-relational problem. It's a mix, and so this could be an appropriate solution.
I might have gone for a different solution, but it seems to work for them.
Writing a 'transaction manager' on top of MySql in application code???
A schema-less RDBMS isn't a RDBMS. Using MySql in order to put 'database developer' on one's CV isn't a good reason to use MySql.
I considered giving up and not posting, but then I saw that others had been even snarkier, so here it is. The point is: learn what RDBMS is *for* before blindly going off on a Brute Force Development rampage. Then figure out how much strength you'll need in your RDBMS, and get one that meets the criteria. MySql likely won't qualify. Tell your VC the truth. Buy DB2/Oracle/Postgres/SQLServer and the hardware you need. You'll write lots less code, have a spiffier system, and more time for your Significant Other.
Perhaps it was a mistake for them to start with MySQL, but perhaps it wasn't. It sounds like they knew its limitations and had an idea of how to get around them. You say they should "learn what a RDBMS is for" before going off on a "brute force development rampage," but remember, again, they are entrepreneurs. The goal is almost certainly not to be 67th-level DB2-nijas, the goal is more likely, "deliver an awesome service that people lost of people dig and make a nice chunk of money along the way, or have fun trying" Diving in with mySQL, the devil they knew, meant they could get down to work. A real DBMS would have required negotiating licensing and then learning all of its quirks because, lets face it, the documentation for commercial databases is never completely accurate. The docs may tell you what is supposed to happen, it doesn't always tell you what actually happens, or adequately convey when performance will go to hell. All that specialist knowledge takes time that might be better spent on the end product
You seem to be suggesting that if they'd chosen right in the beginning, they'd have done less work on their data layer now and have cleaner and more maintainable code. You might be right, at least in hindsight, but they were starting something new, the requirements were no-doubt less clear in the beginning than they are now. But maybe DB2, Oracle, Postgres or MS-SQL could have carried them to their current traffic levels and database size without looking like such a kludge. The thing is, if Friendfeed keeps growing at a healthy clip none of the real RDBMS are likely to see Friendfeed all the way through without starting loose more and more of their flawless 3rd-normal relational beauty.
Real RDBMS's just don't work at the scale of a highly successful community or even ecommerce website without resorting to all sorts of perversions to the relational model. People start partitioning them up the wazzo, joins and transactions end up floating into the application layer. Consistency comes eventually. I'd be genuinely interested if you can name one compelling counterexample. It won't be Amazon. It won't be EBay, who have relegated their relational DBs to not much more than dumb data stores, at least for transaction processing. It won't be MySpace, who have partitioned the hell out of MSSQL. It won't be Facebook, which has a highly partitioned bunch of mySQL instances and various custom services to provide eventual consistency between them. It won't be Linked-in, who've built a special service to hold their social graph, and another to serve as a robust key-value store. The only thing I've heard of recently that might fit is Salesforce.com, but they have much higher ARPUs to work with, and the ability to cleanly partition by customer. I'd certainly sleep better knowing that our dev-team has a clear path forward if our application needs to scale to meet our business goals.
Sure, Friendfeed may not be big enough for that yet, and they may never get there, but I'm pretty sure that's what they aspire to, and if they aspire to that, the right thing to do with their data layer is probably to start iterating on an architecture that will be able to scale all the way up.
I'm not defending mySQL here at all -- I don't want to touch the thing if I don't have to -- but I find it pretty difficult to accept the premise of your criticism.
But I am interested in the differences between the homemade stop-gap measure here and how enterprise databases handle it. I've seen some very fast processing done on Oracle,in live environments, I'm just not sure how it's done.
I work with rather large static databases, and it's all query optimization. We flip the DB to read/write depending on the batch manner. The tables are highly indexed, but I know in general building a "running index" like he suggests is more practical than managing an index on a 250M record table.
One comment- someone here suggested LDAP, which is a great lightweight (ha) method of storing simple user-identifying information. I've also worked in large messaging systems that used the file server instead of the DB.
That sort of like my neighbor. He knows how to cut up a chicken for dinner. He also thinks that qualifies him to be a neurosurgeon.
>> But I am interested in the differences between the homemade stop-gap measure here and how enterprise databases handle it
In the case of DB2, anyway, they have almost everything but the code itself available on their website explaining how the engine works. It helps if you've Elmasri or the like first.
>> That sort of like my neighbor. He knows how to cut up a chicken for dinner. He also thinks that qualifies him to be a neurosurgeon.
You'd still be better of alive with your neighbour drilling hole in you skull to alleviate the pressure and not dead because there wasn't neurosurgeon in vicinity to do the same thing more professionally.
Also you've missed the point. There's actually no skill or reason in throwing you organisation upside down on the quest of spending loads of cash for replacing what you mostly successfully use with REAL rdbms and hire rare and valued professionals that can cope better or worse with these enterprise behemoths.
You present opinions that can probably be associated with archetypical 'IT worker with no business skills'. Sorry for being nasty, it's just what came to my mind.
Commercial rdbms are only for stagnant organisations that are made of cash and don't need to be nearly as cost effective as web 2.0 companies usually are.
Do it Once, in One Place. Kind of like DRY.
On the other hand, until they went crazy with xml in v9, the Linux/Windows versions were spiffy. If you take the time to understand what that engine will do, you will understand that the closer to the data you take your ACID, the more efficient you'll be. DB666, for example, is highly optimized C (on Linux/windows). So your choice: layer on your own version of highly optimized C (or something nicely glacial) while somehow short-circuiting the engine's routines, or just letting the engine do the work. The advantage to vanilla MySql is that it makes no pretence, if you look closely, of being anything more than a SQL parser. So you can create an application which is a modern day monolithic COBOL/VSAM application, but with OO trappings. You've still wasted time and money.
Do you use memcached in combination with Zend or do you use XCache, APC, eAccelerator ?
I was wondering which combo among the above will give optimum performance.
I didn't know FF runs on Python. Still can you suggest a combination between memcached + (Zend or XCache or eAccelerator or APC) for optimum performance ?
I agree APC is a very common choice. But does APC gives the similar performance benefits like Zend or eAccelerator or XCache) ? Actually, I would like to know which solution is best.
Thanks
Cheers
But really, if its really important to get the very best, you should be doing your own tests with your own application. It shouldn't take too long.
I've been working on Magento (www.magentocommerce.com) over the last couple of days and it appears that XCache and APC return about the same level of performance. Surprisingly, memcache does not improve performance in this case but I am sure this is more a reflection on how the app works and not memcache itself. Unfortunately, it looks like this is just a really slow app for now (so slow it makes Rails look like a rocket).
PS. Magento uses Zend under the hood so that may be a factor.
Our application is shortlisted in Google Product Prodigy finals. For more info check :
www.google.co.in/intl/en/productprodigy/finalis...
Our product name is Flare and our team name is 2 Smart 2 Curious.
Thanks for your views
Cheers
Thanks for your shared.
I deal regularly with a single table that has (last I checked), 6 billion rows in it. That's a B there, as in 6,000,000,000 rows. I've had to modify it several times. And I never have had to take down the thing to do it. Queries for any given record run more or less instantly (under a tenth of a second for our most common query, about a quarter of a second for a least common one). One time I did have a long running query on it. I had to JOIN it to a couple other tables to get the data I needed. That run took about 3 minutes, I think.
Databases are designed to hold large amounts of data. At least, good database systems are. If you're running a major enterprise, buck up for a commercial RDBMS, because continuing to program around limitations in your free one is a waste of your valuable time. Don't get me wrong, mySQL is fantastic, but it's not truly large-scale enterprise level software quite yet. It's perfect for smaller databases doing work in a distributed fashion. But if you need to run a multi-terabyte DB, then mySQL will simply get crushed under the load.
Plus, you end up having to write all your own code in order to scale across multiple database servers. Look at WordPress.com. Those guys are literally running PHP code on all their systems that talks to different database servers depending on what data they're looking for. I shit you not: http://codex.wordpress.org/HyperDB . It's as if Load Balancing and Database Replication were not already solved problems... which they are not, if you're using mySQL. You end up having to use hacks like that. Yes, the hacks clearly work, but would you rather spend your time writing and supporting all that hacked up code, or just paying a few thousand bucks and plugging in something that simply works right off the bat? What is your development time worth anyway?
A single table with 6 billion rows, that is pretty cool, but what happens if you need to index a column that you previously didn't have indexed. Do you have to stop access to updates and selects from the table? If you get any interruption then the service is down, friendfeed are saying that they can't have the down time that is associated with potentially hours creating an index on a table.
The idea is quite innovative and looks likes that it performs well for the needs that they have - have you seen how fast the service runs - searches are near instant. Like you said a join on another table in your DB took 3 minutes, Friendfeed can't have that on their system - all queries have to be near instant and I presume not intensive on the system; imagine 2000 people running the same 3 minute query.
The principles used in this article work well in Enterprise Database's too. Trust me I am using this method now. The cool thing about it is that each table(pseudo-index) is really skinny in that there is not too much data in each row.
I am not saying that there is no place for Enterprise Databases nor am I saying that this method is suited for every problem. But it does solve a set of problems that Friendfeed are facing, that traditional ways of deal ways of doing RDBS don't scale to what they need.
I have been around the circuit long enough to know that this is a good idea and works well, but is not a solution for every problem. I don't see anyone cussing out Google for their BigTable system, which when you think about what Friendfeed and Google do they aren't too dissimilar, indexing massive amounts of data and allowing you to query it.
As for the three minute query, that was a one time special case. If I had to run it a lot, I'd add the columns I joined into the table themselves, or I'd make a continually updated view or something similar.
The other thing as well is that the couple droping a couple of "thousand" bucks on a DB you mention is normally for 1 CPU for enterprise DB's, scaling these systems using RAQ or what-ever other systems cost and they cost a lot per node. Then you add things like full text search, which when you look at Oracles offering IS just a hack on the database and has a whole host of its own problems.
All I am saying is that this method works and it works well regardless of the Enterprise DB or not. It suits Friendfeed, it is cheap, it scales well and it easy to create new indexable elements drop other indexes without affecting the live system.