[Bug]: fit time is incorrect #1332

Open
DRMPN opened this issue Sep 7, 2024 · 2 comments
Labels
bug (Something isn't working), core (Core logic related to graph optimisation)

Comments

@DRMPN
Collaborator

DRMPN commented Sep 7, 2024

Expected Behavior

Current Behavior

Green is the real time.
Purple or red is the calculated (estimated) time.

8 hours run
photo_2024-09-07_23-44-37

5 minutes run
photo_2024-09-07_23-44-41

Possible Solution

Multiplication instead of addition in the time estimator, I guess.

Steps to Reproduce

Data can be obtained from: https://www.kaggle.com/competitions/playground-series-s4e9/data
Simple notebooks to reproduce the bug:

  • 8 hours notebook:
import pandas as pd
import numpy as np
from fedot.api.main import Fedot

train = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/train.csv")
test = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/test.csv")
sub = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/sample_submission.csv")

train.drop(columns=["id"], inplace=True)
test.drop(columns=["id"], inplace=True)

auto_model = Fedot(
    problem="regression",
    metric=["rmse"],
    preset="best_quality",
    with_tuning=True,
    timeout=480,
    cv_folds=10,
    seed=42,
    n_jobs=1,
    logging_level=10,
    use_pipelines_cache=False,
    use_auto_preprocessing=False,
)

auto_model.fit(features=train, target="price")

auto_model.current_pipeline.save(
    path="C:/Users/nnikitin-user/Desktop/automl-september/run_8hours/saved_pipelines",
    create_subdir=True,
    is_datetime_in_path=True,
)

prediction = auto_model.predict(features=test)

sub["price"] = prediction.ravel()
sub.to_csv("submission.csv", index=False)
  • 30 mins notebook:
import pandas as pd
import numpy as np
from fedot.api.main import Fedot
from fedot.core.pipelines.pipeline_builder import PipelineBuilder

train = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/train.csv")
test = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/test.csv")
sub = pd.read_csv("C:/Users/nnikitin-user/Desktop/automl-september/playground-series-s4e9/sample_submission.csv")

train.drop(columns=["id"], inplace=True)
test.drop(columns=["id"], inplace=True)

auto_model = Fedot(
    problem="regression",
    metric=["rmse"],
    preset="best_quality",
    with_tuning=True,
    timeout=5,
    cv_folds=10,
    seed=42,
    n_jobs=1,
    logging_level=10,
    initial_assumption=PipelineBuilder().add_node("lgbmreg").build(),
    use_pipelines_cache=False,
    use_auto_preprocessing=False,
)

auto_model.fit(features=train, target="price")

auto_model.current_pipeline.save(
    path="C:/Users/nnikitin-user/Desktop/automl-september/run_lgbm/saved_pipelines",
    create_subdir=True,
    is_datetime_in_path=True,
)

prediction = auto_model.predict(features=test)

sub["price"] = prediction
sub.to_csv("submission.csv", index=False)
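
To put a number on the gap between real and estimated time, the fit call in either notebook can be wrapped in a wall-clock timer. This is a minimal sketch: `timed_call` and `overrun_ratio` are hypothetical helpers written for this report, not part of FEDOT, and the sketch assumes `timeout` is given in minutes, as in the notebooks above.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed wall-clock minutes)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_min = (time.perf_counter() - start) / 60.0
    return result, elapsed_min


def overrun_ratio(elapsed_min: float, timeout_min: float) -> float:
    """Return how many times longer the call took than the requested timeout."""
    if timeout_min <= 0:
        raise ValueError("timeout_min must be positive")
    return elapsed_min / timeout_min


# Usage with the notebooks above (commented out: it needs FEDOT and the data).
# _, minutes = timed_call(auto_model.fit, features=train, target="price")
# print(f"fit took {minutes:.1f} min, {overrun_ratio(minutes, 5):.2f}x the timeout")
```

A ratio close to 1 would mean the run respected the timeout. A much larger ratio, logged next to the estimated time from the debug output, would help narrow down where the estimate diverges.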

Context [OPTIONAL]

Participating in a Kaggle competition https://www.kaggle.com/competitions/playground-series-s4e9

DRMPN added the bug and core labels Sep 7, 2024
@aPovidlo
Collaborator

aPovidlo commented Sep 8, 2024

@DRMPN What value was n_jobs set to?

@DRMPN
Collaborator Author

DRMPN commented Sep 8, 2024

@DRMPN What value was n_jobs set to?

n_jobs = 1
