Add the logic for Energy Star #261

Draft · regisss wants to merge 244 commits into main

Conversation

@regisss (Collaborator) commented Sep 12, 2024

No description provided.

@IlyasMoutawwakil (Member)

Can you try fixing the energy tracker? It seems to be broken (or pin a codecarbon version).
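
For the pinning option, a sketch of what that could look like in the package requirements (the exact version below is an assumption, not a verified working one):

# setup.py sketch -- the pinned codecarbon release is hypothetical
# and would need to be checked against the energy tracker
install_requires = [
    "codecarbon==2.3.4",  # pin a known-good release instead of tracking latest
]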

Comment on lines +227 to +229
# See https://github.com/huggingface/diffusers/issues/4649
if "xl" in self.config.model:
    self.pretrained_model.unet.config.addition_embed_type = None

@IlyasMoutawwakil (Member)

What's the actual problem here? Reading the conversation, it seems users hit this issue when they use the wrong pipeline for their target model (a Stable Diffusion pipeline for an XL model), which is not something we want to hack around here.
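
For reference, a sketch of the alternative: let diffusers resolve the pipeline class from the checkpoint's own config instead of string-matching the model name (model_id stands in for self.config.model here):

from diffusers import AutoPipelineForText2Image

# resolves to StableDiffusionPipeline or StableDiffusionXLPipeline
# based on the checkpoint's config, not on "xl" in the model id
pipe = AutoPipelineForText2Image.from_pretrained(model_id)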

Comment on lines +12 to +34
def preprocess(
    dataset: Dataset,
    task: str,
    config: EnergyStarConfig,
    preprocessor: PretrainedProcessor,
    pretrained_config: PretrainedConfig,
) -> Dataset:
    task_to_preprocessing = {
        "feature-extraction": feature_extraction_preprocessing,
        "sentence-similarity": sentence_similarity_preprocessing,
        "text-classification": text_classification_preprocessing,
        "question-answering": question_answering_preprocessing,
        "text-generation": text_generation_preprocessing,
        "text2text-generation": text2text_generation_preprocessing,
        "summarization": summarization_preprocessing,
        "stable-diffusion": image_generation_preprocessing,
        "automatic-speech-recognition": automatic_speech_recognition_preprocessing,
        "image-to-text": image_to_text_preprocessing,
        "image-classification": image_preprocessing,
        "object-detection": image_preprocessing,
    }

-   return task_to_preprocessing[task](dataset, config, preprocessor)
+   return task_to_preprocessing[task](dataset, config, preprocessor, pretrained_config)

@IlyasMoutawwakil (Member)

Maybe using the dict directly makes more sense? Why go through a function if it's only dispatching via a dict?
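
A sketch of that suggestion, with the mapping at module level and the dispatch done at the call site (entries abridged):

# module-level dispatch table instead of a wrapper function
TASK_TO_PREPROCESSING = {
    "feature-extraction": feature_extraction_preprocessing,
    "text-generation": text_generation_preprocessing,
    # ... remaining tasks as in the original dict
}

# call site: index the dict directly
dataset = TASK_TO_PREPROCESSING[task](dataset, config, preprocessor, pretrained_config)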

@@ -32,7 +58,7 @@ def tokenize_function(examples):
        examples[config.text_column_name],
        padding=padding,
        truncation=config.truncation,
-       max_length=config.max_length if config.max_length != -1 else None,
+       max_length=getattr(pretrained_config, "max_position_embeddings", 512),

@IlyasMoutawwakil (Member)

Using model_shapes (the normalized dictionary of model variables) makes more sense here.
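
A hypothetical sketch of that; the name model_shapes and the key below are assumptions about the normalized dict, not confirmed API:

# model_shapes: the backend's normalized dict of model dimensions
max_length = model_shapes.get("max_position_embeddings", 512)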

if getattr(tokenizer, "pad_token", None) is None:
    tokenizer.pad_token = tokenizer.eos_token

padding = False if config.input_shapes["batch_size"] == 1 else True

@IlyasMoutawwakil (Member)

Suggested change
-padding = False if config.input_shapes["batch_size"] == 1 else True
+padding = config.input_shapes["batch_size"] != 1

if getattr(tokenizer, "pad_token", None) is None:
    tokenizer.pad_token = tokenizer.eos_token

padding = False if config.input_shapes["batch_size"] == 1 else True

@IlyasMoutawwakil (Member)

Same as above.

Comment on lines +126 to +128
# Add a pad token if the tokenizer doesn't have one
if getattr(tokenizer, "pad_token", None) is None:
    tokenizer.pad_token = tokenizer.eos_token

@IlyasMoutawwakil (Member)

This is repeated multiple times; it makes more sense to do it once after tokenizer instantiation (in transformers_utils.py).
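
A sketch of that, assuming the tokenizer is instantiated in a single helper (e.g. in transformers_utils.py):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(config.model)

# set the pad token once, right after instantiation, instead of
# repeating the fallback in every preprocessing function
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token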

Comment on lines 70 to 76
if is_torch_distributed_available() and torch.distributed.is_initialized():
-   LOGGER.info("\t+ Distributing batch size across processes")
+   self.logger.info("\t+ Distributing batch size across processes")
    if self.config.input_shapes["batch_size"] % torch.distributed.get_world_size() != 0:
        raise ValueError(
            "The batch size must be divisible by the number of processes in a distributed environment"
        )
    self.config.input_shapes["batch_size"] //= torch.distributed.get_world_size()

@IlyasMoutawwakil (Member)

This logic is no longer needed, as we do it in the backend now:
https://github.com/huggingface/optimum-benchmark/blob/main/optimum_benchmark/backends/pytorch/backend.py#L410-L440
It was introduced to implement TP vs DP logic (same input vs split inputs). You'll have to call these methods after the inputs have been generated and processed, because they might zero the batch_size for TP on non-main processes:
https://github.com/huggingface/optimum-benchmark/blob/main/optimum_benchmark/scenarios/inference/scenario.py#L86-L92

Comment on lines +169 to +175
context_stack = ExitStack()
if self.config.energy:
    context_stack.enter_context(energy_tracker.track())

with context_stack:
    self.logger.info("\t+ Loading model for Inference")
    backend.load()

@IlyasMoutawwakil (Member)

Suggested change
-context_stack = ExitStack()
-if self.config.energy:
-    context_stack.enter_context(energy_tracker.track())
-
-with context_stack:
-    self.logger.info("\t+ Loading model for Inference")
-    backend.load()
+with energy_tracker.track():
+    self.logger.info("\t+ Loading model for Inference")
+    backend.load()

Will an Energy Star scenario ever need an energy argument in its config, given that it will always run energy tracking?


for k in range(self.config.iterations):
    self.logger.info(f"\t+ Prefill iteration {k+1}/{self.config.iterations}")
    with self.energy_tracker.track(file_prefix="prefill"):

@IlyasMoutawwakil (Member)

You might want to add an index to the file prefixes to keep the emissions CSV files from all iterations.
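
For instance (a sketch reusing the loop variable from the snippet above):

# include the iteration index so each iteration's emissions csv
# is kept instead of being overwritten by the next one
with self.energy_tracker.track(file_prefix=f"prefill_{k}"):
    ...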


prefill_kwargs = {**self.config.generate_kwargs, **TEXT_GENERATION_PREFILL_OVERRIDES}

for k in range(self.config.iterations):

@IlyasMoutawwakil (Member)

Maybe we can call it repetitions, since its implementation differs from that of iterations in the inference benchmark.

Comment on lines +195 to +198
try:
    prefill_volume += inputs["input_ids"].size(dim=1) * self.config.input_shapes["batch_size"]
except KeyError:
    prefill_volume += 1

@IlyasMoutawwakil (Member)

I don't understand this part

self.report.prefill.efficiency = Efficiency.from_energy(
    prefill_energy, prefill_volume, unit=TEXT_GENERATION_EFFICIENCY_UNIT
)
self.report.prefill.measures = prefill_measures

@IlyasMoutawwakil (Member)

This is kind of ambiguous, and I'm not sure how the benchmark report behaves with a list of values. Wouldn't it be better to have an Energy class that can take multiple measures, or even an EnergyStar class that contains the whole thing?
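
A hypothetical sketch of such a container; the Energy fields here are stand-ins for illustration, not the actual optimum-benchmark dataclass:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Energy:
    # stand-in for the real Energy dataclass; fields are assumptions
    unit: str
    total: float

@dataclass
class EnergyMeasures:
    # collects per-iteration measures instead of a raw list on the report
    measures: List[Energy] = field(default_factory=list)

    def aggregate(self) -> Energy:
        # sum totals across iterations; unit taken from the first measure
        return Energy(unit=self.measures[0].unit, total=sum(m.total for m in self.measures))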


@IlyasMoutawwakil (Member) left a comment

Very cool, would love to see this finally merged and continuously tested. I think it's been out of sync for a long time now, specifically because we don't have tests for it. Tell me if you need help 🤗
