In a newly published paper on the preprint server arXiv.org, researchers at the Montreal AI Ethics Institute, McGill University, Carnegie Mellon, and Microsoft propose a four-pillar framework called SECure designed to quantify the environmental and social impact of AI.
Through techniques like compute-efficient machine learning, federated learning, and data sovereignty, the coauthors assert, scientists and practitioners have the power to shrink AI's carbon footprint while restoring trust in historically opaque systems.
Sustainability, privacy, and transparency remain under-addressed, unsolved challenges in AI. Last June, researchers at the University of Massachusetts at Amherst released a study estimating that training and searching a given model can emit roughly 626,000 pounds of carbon dioxide, equivalent to nearly 5 times the lifetime emissions of the average U.S. car. Partnerships like those pursued by DeepMind and the UK’s National Health Service conceal the true nature of the systems being developed and piloted. And sensitive training data often leaks onto the public web, usually without stakeholders’ knowledge.
SECure’s first pillar, compute-efficient machine learning, aims to lower the computational burden that typically makes access inequitable for researchers who aren’t affiliated with organizations that have heavy compute and data processing infrastructure. It proposes creating a standardized metric that could be used to make quantified comparisons across hardware and software configurations, allowing people to make informed decisions when choosing one system over another.
The second pillar of SECure proposes the use of federated learning approaches as a mechanism to perform on-device training and inferencing of models. (In this context, federated learning refers to training an algorithm across decentralized devices or servers holding data samples without exchanging those samples, enabling multiple parties to build a model without liberally sharing data.) As the coauthors note, federated learning can decrease carbon impact if computations are performed where electricity is produced using clean sources. As a second-order benefit, it mitigates the risks and harm that arise from data centralization, including data breaches and privacy intrusions.
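To make the idea concrete, here is a minimal sketch of federated averaging, one common way to do federated learning. It is not code from the SECure paper. The toy linear model, the function names (`local_update`, `federated_average`, `train`), the learning rate, and the two simulated clients are all assumptions chosen for illustration. Each client trains on its own samples and sends back only model parameters, never the samples themselves.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # one (x, y) training pair
Params = Tuple[float, float]   # (weight, bias) of the toy model y = w * x + b


def local_update(w: float, b: float, data: List[Sample],
                 lr: float = 0.05, epochs: int = 5) -> Params:
    """Run a few gradient-descent steps on one client's private data."""
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error, computed from the current (w, b).
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Params], sizes: List[int]) -> Params:
    """Combine client parameters, weighting each client by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b


def train(clients: List[List[Sample]], rounds: int = 300) -> Params:
    """Server loop: broadcast the model, collect local updates, average them."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        # Only parameters cross the network; raw samples stay on each client.
        updates = [local_update(w, b, data) for data in clients]
        w, b = federated_average(updates, [len(d) for d in clients])
    return w, b


# Two simulated devices (hypothetical data), both drawn from y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
w, b = train(clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

In this toy setup the shared model recovers the underlying line even though neither client ever sees the other's data. Real deployments add pieces this sketch leaves out, such as secure aggregation, client sampling, and handling for devices that drop out mid-round.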
SECure’s third pillar, data sovereignty, refers to the idea of strong data ownership: affording individuals control over how their data is used, for what purposes, and for how long. It also allows users to withdraw consent if they see fit, while respecting differing norms regarding ownership that are typically ignored in discussions around diversity and inclusion as they relate to AI. The coauthors point out that some indigenous perspectives on data require that data be maintained on indigenous land, for example, or used and processed in ways consistent with certain values.
“In the domain of [machine learning], especially where large data sets are pooled from numerous users, the withdrawal of consent presents a major challenge,” wrote the researchers. “Specifically, there are no clear mechanisms today that allow for the removal of data traces or of the impacts of data related to a user … without requiring a retraining of the system.”[…]