Facebook Storage Engineer, RTP in Menlo Park, California
Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities — we're just getting started.
Facebook is seeking a Storage Engineer to join our RTP (Release to Production) Team. Our servers and data centers are the foundation upon which our rapidly scaling infrastructure efficiently operates and our innovative services are delivered. RTP Engineers will work closely with server and Data Center Engineering, Storage H/W Design, H/W ODMs, Storage Media Vendors, and Capacity Engineering to thoroughly test and validate that our systems will be manageable at large scale. This position is full-time and located in our Menlo Park office.
Interface with external Storage H/W vendors, external Storage Media vendors, and internal hardware, mechanical, power, thermal and software engineers to understand system architecture and to develop and execute test suites for various storage architectures
Create and update our software, firmware, hardware, thermal, and mechanical test plans for various storage solutions by partnering with the appropriate individual subject matter experts or teams
Partner with the Vendor Management Team to evaluate components and provide inputs for reliability and feasibility
Execute tests according to plan, while keeping a thorough procedural record and data log
Maintain our automated test infrastructure through the use of scripting languages and remotely controlled test equipment
Develop and publish test reports and communicate findings to team members using appropriate communication channels
Diagnose system failures and drive them to root cause, isolate failing components, and articulate the impact on applications
Respond on an as-needed basis to emergencies and provide remedies for catastrophic failure events
3+ years of work experience in Storage Hardware or Software
Experience in Linux and scripting ability
3+ years working with Storage Vendors (HDD/SSD/Flash) and/or Storage Server Suppliers
Hands-on experience in changing system configurations and enabling new Hardware Add-Ons such as new RAID cards and/or Flash
Experience with HDD server and storage architectures
Familiar with at least one of the following: PCIe, SCSI, SAS, SATA, iSCSI, FC, IB, or NAS
Troubleshooting and analytical skills
Familiar with lab equipment such as protocol analyzers, oscilloscopes, power meters, and air flow chambers
2+ years of experience in 24x7 Production support at scale (e.g., 10K storage servers and over 100K HDDs)
2+ years of experience scripting automation in Python, PHP, or Perl
Equal Opportunity: As part of our dedication to the diversity of our workforce, Facebook is committed to Equal Employment Opportunity without regard for race, color, national origin, ethnicity, gender, protected veteran status, disability, sexual orientation, gender identity, or religion. We are also committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at email@example.com or you may call us at +1 650-308-7837.