Hadoop Survey





Stefan Groschupf
Scale Unlimited
sg{at}101tec.com

Result report (Sep 17, 2008)




  www.scaleunlimited.com




1. What is the size of your company?

Company size (employees)    Responses
1-10                        10
11-50                       11
51-100                      4
101-500                     1
501-1000                    1
1001-5000                   1
5001+                       2






2. How long have you been using Hadoop (in months)?



Time using Hadoop    Share
1-3 months           29%
4-6 months           25%
7-12 months          33%
13-24 months         13%








3. How many nodes do you have running today?

Nodes today    Responses
0-5            7
6-10           4
11-20          8
21-50          1
50-100         3
100+           (no value shown)






4. How many nodes do you plan to have in 6 months?

Nodes planned in 6 months    Responses
0-5                          3
6-10                         3
11-20                        7
21-50                        6
50-100                       1
100+                         3





5. How many developers are using Hadoop in your organization?

Developers    Share
1-3           67%
4-10          25%
10+           8%







6. How many system administrators are managing your Hadoop cluster?

Administrators    Share
1                 91%
2                 4%
3+                4%








7. How many Hadoop jobs do you run every day?

Jobs per day    Responses
0-5             9
6-10            5
11-20           4
21-50           4
50+             2






8. How much data is stored on your Hadoop cluster?

Data stored       Share
< 100 GB          17%
100-500 GB        33%
501-1,000 GB      4%
1,000-5,000 GB    29%
5,000+ GB         17%







9. How much new data is added each month?



New data per month    Share
0-10 GB               27%
11-100 GB             41%
101-500 GB            14%
500+ GB               18%







10. Do you use HDFS or KFS for data storage?




Storage    Responses
HDFS       23
KFS        (no value shown)








11. How many files do you have stored in HDFS/KFS?


Files stored        Responses
1-100               2
101-1,000           5
1,001-10,000        4
10,001-1,000,000    7
1,000,001+          4







12. Do you use any other related tools?



Tool                Share
HBase               27%
Hadoop streaming    21%
Lucene              18%
Pig                 15%
Cascading           12%
Cassandra           3%
Mahout              3%






13. Do you own or rent (e.g. Amazon EC2) your cluster nodes?

Ownership               Responses
We own the hardware     15
We rent the hardware    3
Both                    5








14. Which Java version do you primarily use?

Java version    Share
Java 5          25%
Java 6          75%







15. Where does the data stored in your HDFS/KFS originate?

Source      Share
Files       47%
Other       21%
Database    18%
HBase       12%
JMS         3%







16. To where do you extract your HDFS/KFS data?



Destination    Share
Files          36%
Database       27%
HBase          14%
Lucene         11%
Other          11%







17. Which version of Hadoop are you using currently?



Hadoop version    Share
0.16              13%
0.17              46%
0.18              42%








Please take the survey!




http://www.scaleunlimited.com/hadoop-resources.html



