Removing outliers from the 'price' and 'extra_people' columns

The first outliers to be removed will be from the 'price' and 'extra_people' collumns as they have the highest relevance in calculating the final price and they are continues numerical values (can be measured).

for collumn in ['price', 'extra_people'] :
    main_dataframe, amount_removed_lines = exclude_outliers(main_dataframe, collumn)

    print(f"{amount_removed_lines} lines were removed from {collumn}")
    ###histogram(main_dataframe[collumn])
    ###box_plot(main_dataframe[collumn])

1712967427666

Nearly 10% of the lines were removed from the 'price' column

Obs: when the price collumn is a integer, the quantity of apparments increases because landlords usally put their price as a whole value.