Is this thing on?
Engineering Productivity Update, Oct 1, 2015
October 1, 2015
We’ve said good-bye to Q3, and are moving on to Q4. Planning for Q4 goals and deliverables is well underway; I’ll post a link to the final versions next update.
Last week, a group of 8-10 people from Engineering Productivity gathered in Toronto to discuss approaches to several aspects of developer workflow. You can look at the notes we took; next up is articulating a formal Vision and Roadmap for 2016, which will incorporate both this work and other planning that is ongoing separately for things like MozReview and Treeherder.
Bugzilla: Support for 2FA has been enhanced.
Treeherder:
- The automatic starring backend, along with related database changes, is now in production. In Q4 we’ll be developing a simple UI for this, and by the end of the quarter, automatic starring for at least simple failures should be a reality.
- :Goma’s line highlighting and linking in the log viewer are now live. See this blog post for details.
- Jonathan French, our awesome contractor and contributor, has landed onscreen shortcuts; see this blog post. Jonathan will be moving on to other things soon, and we’ll sorely miss him!
Perfherder and Performance Automation:
- Work is underway to prototype a UI in Perfherder which can be used for performance sheriffing without Alert Manager or Graphserver; follow bug 1201154 for more details. Separately, work has started to allow other performance harnesses (besides Talos) to submit data to Perfherder.
- Talos on linux32 has been turned off; the machines that had been used for this are being repurposed as Windows 7 and Windows 8 test workers, in order to reduce overall wait times on those platforms.
- The dromaeo DOM Talos test has been enabled on linux64.
MozReview and Autoland: mcote published a blog post detailing some of the rough edges in MozReview and explaining how the team intends to tackle them. dminor blogged about the state of autoland; in short, we’re getting close to rolling out an initial implementation which will work similarly to the current “checkin-needed” mechanism, except, of course, it will be entirely automated. May you never have to worry about closed trees again!
Mobile Automation: gbrown made some additional improvements to mach commands on Android; bc has been busy with a lot of Autophone fixes and enhancements.
Firefox Automation: maja_zf has enabled MSE playback tests on trunk, running per-commit. They will go live at the next buildbot reconfig.
Developer Workflow: numerous enhancements have been made to |mach try|; see the list below in the Details section. kaustabh93, one of the team’s contributors, has applied run-by-dir to mochitest-plain on most platforms and to mochitest-chrome-opt. This reduces test bleedthrough, a source of intermittent failures, and improves our ability to change job chunking without breaking tests.
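To make the run-by-dir idea concrete, here is a minimal sketch. It is not the mochitest harness code. It assumes that run-by-dir means tests from one directory run together in isolation, and that chunking then assigns whole directories rather than individual tests. The test paths and the helper names `group_by_directory` and `chunk_directories` are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(test_paths):
    """Map each directory to the sorted list of tests it contains."""
    groups = defaultdict(list)
    for path in test_paths:
        groups[str(PurePosixPath(path).parent)].append(path)
    return {directory: sorted(tests) for directory, tests in sorted(groups.items())}


def chunk_directories(groups, total_chunks):
    """Assign whole directories to chunks, never splitting a directory.

    Greedy balancing: hand the largest remaining directory to the chunk
    that currently holds the fewest tests.
    """
    chunks = [[] for _ in range(total_chunks)]
    sizes = [0] * total_chunks
    for directory in sorted(groups, key=lambda d: (-len(groups[d]), d)):
        target = sizes.index(min(sizes))
        chunks[target].append(directory)
        sizes[target] += len(groups[directory])
    return chunks


# Hypothetical test paths, for illustration only.
tests = [
    "dom/base/test/test_a.html",
    "dom/base/test/test_b.html",
    "dom/events/test/test_click.html",
    "layout/style/test/test_css.html",
]
groups = group_by_directory(tests)
for number, directories in enumerate(chunk_directories(groups, 2), start=1):
    for directory in directories:
        # Under the stated assumption, each directory would run in isolation,
        # so state left behind by one directory cannot leak into another.
        print(f"chunk {number}: {directory} ({len(groups[directory])} tests)")
```

Because a directory is never split, changing the number of chunks only changes which directories share a job. It does not change which tests run alongside each other within a directory, which is why rechunking becomes less likely to expose hidden dependencies between tests.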
Build System: gps has improved test package generation, which results in significantly faster builds – a savings of about 5 minutes per build on OSX and Windows in automation; about 90s on linux.
TaskCluster Migration: linux64 debug builds are now running, so ahal is unblocked on getting linux64 debug tests running in TaskCluster. armenzg has landed mozharness code to support running buildbot jobs scheduled by TaskCluster through the buildbot bridge.
Details
- bug 1199087 – 2fa protection tidied up and extended beyond login
- bug 1199090 – add printable 2fa recovery codes
- as always: https://wiki.mozilla.org/BMO/Recent_Changes
- [jgraham] Autoclassification backend now working on Treeherder production
- [jgraham+mdoglio] API endpoint for autoclassification data now landed on master
- [jfrench] :Goma’s Line highlighting and line linking in Logviewer is now on master https://tojonmz.wordpress.com/2015/09/28/line-highlighting-and-line-linking-in-logviewer/ – https://bugzilla.mozilla.org/show_bug.cgi?id=1108764
- [jfrench] Onscreen keyboard shortcuts are now on master https://tojonmz.wordpress.com/2015/09/29/onscreen-keyboard-shortcuts/ – ‘part 1’ of https://bugzilla.mozilla.org/show_bug.cgi?id=1141569
- [jfrench] Over the last several weeks treeherder.css has been split into 8 separate components https://github.com/mozilla/treeherder/tree/master/ui/css for anyone who is adding new styles – https://bugzilla.mozilla.org/show_bug.cgi?id=1193804
- [emorley] Treeherder will soon stop posting bug comments for each intermittent failure. Instead, OrangeFactor will post periodic summaries on bugs; see: https://groups.google.com/d/msg/mozilla.dev.tree-management/az643p0u4hs/3el7fqIDBwAJ
- [camd] Job Ingestion via Pulse Exchanges is in the final review stages. This will allow projects like TaskCluster to send JSON Schema-validated job data to Treeherder via a Pulse Exchange, rather than through our APIs. It also gives developers and testers the ability to ingest production jobs from TaskCluster on their local machines. Blog post: https://cheshirecam.wordpress.com/2015/09/30/treeherder-loading-data-from-pulse/
- [jmaher] Linux32 Talos is turned off
- [jmaher] Dromaeo DOM is enabled for Linux64 Talos
- Big perfherder database refactoring landed, which paves the way to expanding the scope of the system — https://bugzilla.mozilla.org/show_bug.cgi?id=1192976
- [wlach] Prototyping UI for sheriffing performance alerts in https://bugzilla.mozilla.org/show_bug.cgi?id=1201154
- [wlach] Started on work to let other harnesses besides talos easily submit performance artifacts to perfherder. https://bugzilla.mozilla.org/show_bug.cgi?id=1175295
- [armenzg] Mozharness code landed to support Buildbot Bridge test jobs
- [ahal] started work getting linux64 tests running with taskcluster
- mach cppunittest now supports Firefox for Android
- mach test commands now download host utilities for Firefox for Android
- [bc] Autophone
- Bug 1202826 – Autophone – 2015-09-09 deployment
- Bug 1202833 – Autophone – CHARGING state should not prevent Autophone shutdown/restart
- Bug 1201061 – Autophone – deploy robocop_adobe_flash.html
- Bug 1196115 – Intermittent Crash Autophone S1S2Test beginning 2015-08-18
- Bug 1207836 – Autophone – 2015-09-23 deployment
- Bug 1205864 – Autophone – phonetest.py:Logcat collects duplicate messages
- Bug 1206954 – Autophone – better handle failures to submit results to PhoneDash
- Bug 1209796 – Autophone – next deployment (In progress)
- Bug 1205836 – Autophone – investigate orange for remote nytimes s1s2
- Bug 1208782 – Autophone – do not attempt to get response json during Treeherder submission error if response is None
- Bug 1209647 – Autophone – eliminate startup check for network connectivity
- Bug 1209651 – Autophone – do not allow logcat device error to prevent setup_job initialization
- Bug 1209653 – Autophone – after clearing logcat, specifying -b main can hang
- Bug 1209675 – Autophone – Logcat should use PhoneTest loggerdeco
- Bug 1209691 – Autophone – handle incorrect logcat dates emitted by devices.
- jmaher/wlach working to get Autophone Talos reporting results to PerfHerder
Firefox and Media Automation
- [maja_zf] MSE Video Playback buildbot jobs will be deployed to run per-commit on mozilla-inbound any day now…
- [ahal] started work on reftest using structured logging
- [ahal] consolidate mochitest + xpcshell’s StructuredLog.jsm
- [jgraham] Landed new |mach try| implementation that passes test paths rather than manifest paths; this adds support for web-platform-tests in |mach try|
- [jgraham] Added support for saving and reusing try strings in |mach try|
- [jgraham] Added Talos support to |mach try|
- [jgraham] reftest and xpcshell test harnesses now take paths to multiple test locations on the command line and expose more functionality through mach
- [jmaher] Kaustabh93 has run-by-dir live for mochitest-plain osx debug and mochitest-chrome opt; all that is left is mochitest-chrome debug and linux64 ASAN e10s.
- [ato] Support for running Marionette tests using `mach try` in review
- [ekyle] Upgraded cluster to 1.7.1 (1.4.2 had known recovery issues)
- [ekyle] Added a third `zone` with a full copy of the data for redundancy. Zone awareness across three zones does not seem to work as expected and appears to cause instability; looking into the problem further: https://github.com/elastic/elasticsearch/issues/13667#issuecomment-141903363
- [ekyle] Fixed problems with deep queries and deployed the fix. We can now query subtests: http://activedata.allizom.org/tools/query.html#query_id=HyQAbwOd . It turns out the subtests are a bit of a mess, so this is of limited use right now.
- [ato] Defined remote end steps for Element Clear command
- [ato] Element location strategies have been outlined
- [ato] Added steps to Base64 encode screen capture results
- [ato] Because implementors have relied on prose from outdated sections, warnings were added to those sections which have yet to be redefined
- [ato] Plus a ton of assorted fixes and rewording
- [ato] findChildElement and findChildElements commands removed
- [bc] Have been keeping the system running, helping triage bugs
- [tomcat] Has been filing bugs, and sent a September status report to an internal set of people.
- bugs 924405/1199788 – Bugherder now uses Bugzilla’s native REST API and can use Bugzilla API keys for authentication even when 2FA is enabled.
Firefox build system
- [gps] Test packaging is now drastically faster in automation. 50% reduction across all platforms. This is a 5+ minute decrease on OS X build jobs!