Shulou (Shulou.com), 06/01 report. Updated 2025-04-03.
This article explains in detail how to predict AirBnB rental prices in New York City with TensorFlow.
Introduction
Airbnb is an online marketplace that allows people to rent out their properties or spare rooms to guests. It charges a commission on every booking: 3% from hosts and between 6% and 12% from guests.
Since its founding in 2009, the company has grown from helping 21,000 guests find accommodation each year to helping 6 million people go on vacation each year, and it currently lists an astonishing 800,000 properties in 34,000 cities across 90 countries.
I'm going to use the New York City Airbnb Open Data dataset from Kaggle to build a neural network model with TensorFlow and make predictions.
The goal is to build a suitable machine learning model capable of predicting the price of future listings.
I'm going to show you the Jupyter Notebook that I created. You can find it on GitHub: github.com/Timothy102/Tensorflow-for-Airbnb-Prices
Load data
First, let's look at how to load the data. We use wget to download it directly from Kaggle. Note that the -O flag (capital O) sets the output file name; lowercase -o would set wget's log file instead.
The dataset has 48,895 rows and 16 columns.
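As a minimal sketch of the loading step, the downloaded CSV can be read with pandas. The helper name load_listings and the tiny in-memory sample are illustrative assumptions (the sample rows are made up so the snippet runs without the real file); in the notebook you would pass the path of the file saved by wget instead.

```python
import io

import pandas as pd


def load_listings(source):
    """Read the Airbnb listings CSV from a file path or a file-like object."""
    return pd.read_csv(source)


# Made-up stand-in for the real file, only so this snippet is runnable as is.
sample = io.StringIO(
    "id,neighbourhood_group,latitude,longitude,price\n"
    "1,Brooklyn,40.65,-73.97,149\n"
    "2,Manhattan,40.75,-73.98,225\n"
)

df = load_listings(sample)
print(df.shape)  # (2, 5) for the sample; the real file has 48,895 rows and 16 columns
```

With the real data, df.head() and df.info() are quick ways to inspect the columns and spot missing values.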
Data analysis and preprocessing
Seaborn has a very concise API for drawing many kinds of plots. If you're not familiar with its syntax, check out this article: www.analyticsvidhya.com/blog/2019/09/comprehensive-data-visualization-guide-seaborn-python/
We call corr on the pandas DataFrame and pass the resulting correlation matrix to seaborn's heatmap function.
Since we have latitude, longitude, and neighborhood data, let's also create a scatterplot of the listings.
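The two plots described above might be produced roughly as follows. This is a sketch rather than the notebook's exact code: the function names, figure sizes, and color map are assumptions, and the column names latitude, longitude, and neighbourhood_group are taken from the Kaggle dataset.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_correlations(df):
    """Draw a heatmap of pairwise correlations between the numeric columns."""
    corr = df.select_dtypes(include="number").corr()
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    return corr, ax


def plot_listings(df):
    """Scatter the listings on a longitude/latitude plane, colored by borough."""
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.scatterplot(
        data=df,
        x="longitude",
        y="latitude",
        hue="neighbourhood_group",
        s=10,
        ax=ax,
    )
    return ax
```

Calling plot_correlations(df) and plot_listings(df) on the loaded DataFrame draws both figures; in a notebook they render inline.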
I also removed duplicates and unnecessary columns, and filled in "reviews_per_month" because it had too many missing values. The cleaned data has 10 columns and no null values.
That's good, right?
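The cleaning step could look something like this. The article does not say which columns were dropped or which fill value was used, so both DROP_COLUMNS and the fill value of 0 are assumptions chosen so that the 16 original columns shrink to 10.

```python
import pandas as pd

# Assumed choice of columns to drop; the article does not list them.
DROP_COLUMNS = ["id", "name", "host_id", "host_name", "last_review", "neighbourhood"]


def clean_listings(df):
    """Drop duplicate rows and unused columns, then fill missing review rates."""
    out = df.drop_duplicates()
    out = out.drop(columns=DROP_COLUMNS, errors="ignore")
    # Assumption: a listing without reviews gets 0 reviews per month.
    out = out.fillna({"reviews_per_month": 0})
    return out.reset_index(drop=True)
```

Running clean_listings(df).isna().sum() afterwards is a quick check that no null values remain.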
First of all, computers work with numbers, so we need to convert the categorical columns into numeric form. Here this is done with pandas' factorize method, which maps each category to an integer code; strictly speaking, this is label encoding rather than one-hot encoding. There are many other tools you can use.
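To make the difference concrete, here is a small sketch that encodes a few room_type values both ways. The four sample values are illustrative; pd.get_dummies is shown as one possible alternative for true one-hot vectors, not as the method used in the notebook.

```python
import pandas as pd

rooms = pd.Series(["Private room", "Entire home/apt", "Private room", "Shared room"])

# factorize assigns integer codes in order of first appearance.
codes, categories = pd.factorize(rooms)
print(codes)             # [0 1 0 2]
print(list(categories))  # ['Private room', 'Entire home/apt', 'Shared room']

# get_dummies builds one indicator column per category (one-hot encoding).
one_hot = pd.get_dummies(rooms)
print(one_hot.shape)     # (4, 3)
```

Integer codes are compact but imply an ordering between categories, while one-hot columns avoid that at the cost of extra width.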
To keep the loss in a stable range, let's standardize some of the numeric columns so that each has a mean of 0 and a standard deviation of 1.
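A minimal sketch of this standardization, assuming it is applied column by column as (x - mean) / std. The helper name standardize is hypothetical, and which columns to pass is left to the caller because the article does not name them.

```python
import pandas as pd


def standardize(df, columns):
    """Return a copy with the given columns rescaled to mean 0 and std 1."""
    out = df.copy()
    for col in columns:
        mean = out[col].mean()
        std = out[col].std(ddof=0)  # population standard deviation
        out[col] = (out[col] - mean) / std
    return out
```

A column whose values are all identical has a standard deviation of 0 and would turn into NaN here, so such columns are better dropped beforehand.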
Feature crossing
We have to make one more change, and it's an essential one. To relate longitude and latitude to the model output, we must create a feature cross. The links below should give you enough background to get a proper feel for feature crosses:
https://developers.google.com/machine-learning/crash-course/feature-crosses/video-lecture
https://www.kaggle.com/vikramtiwari/feature-crosses-tensorflow-mlcc
Our goal is to introduce a latitude and longitude cross, one of the oldest tricks in the book. If we fed these two columns into the model as raw values, it would assume that each value is gradually related to the output.
Instead, we'll use a feature cross, which means we'll split the latitude × longitude map into a grid. Fortunately, TensorFlow makes this easy.
I step from the minimum to the maximum of each coordinate in increments of (max - min)/100 to generate evenly spaced grid boundaries, which gives a 100×100 grid.
Essentially, what we're doing here is defining bucketized columns from the boundaries computed earlier and creating a DenseFeatures layer that we'll pass to the Sequential API.
If you are unfamiliar with the TensorFlow syntax, check the documentation: www.tensorflow.org/api_docs/python/tf/feature_column/
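The idea behind the grid can be illustrated without TensorFlow. The sketch below is a NumPy analogue of bucketizing and crossing, not the feature-column code from the notebook: it cuts each coordinate into 100 evenly spaced buckets and combines the two bucket indices into a single cell id. The function names are hypothetical.

```python
import numpy as np

N_BINS = 100  # grid resolution per axis, as in the article


def grid_boundaries(values, n_bins=N_BINS):
    """Evenly spaced interior boundaries between the min and max of values."""
    lo, hi = float(np.min(values)), float(np.max(values))
    return np.linspace(lo, hi, n_bins + 1)[1:-1]


def cross_lat_lon(lat, lon, n_bins=N_BINS):
    """Map each (lat, lon) pair to one grid-cell id in [0, n_bins**2)."""
    lat_bin = np.digitize(lat, grid_boundaries(lat, n_bins))
    lon_bin = np.digitize(lon, grid_boundaries(lon, n_bins))
    return lat_bin * n_bins + lon_bin
```

Each cell id acts as a categorical value, so the model can learn a separate effect for every small patch of the map instead of one trend along each axis. The ids can then be one-hot encoded or fed to an embedding layer.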
Now, finally, we are almost ready for model training. All that remains is splitting the data.
We have to create two datasets: one containing the input features and the other containing the values to predict (the prices). Because the two did not match in size, which could cause problems for our model, I truncated the longer one.
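One way to sketch this split is shown below. The function names, the 80/20 train/test ratio, and the fixed random seed are assumptions; the article only states that features and prices are separated and that mismatched lengths are truncated.

```python
import numpy as np
import pandas as pd


def split_features_labels(df, target="price"):
    """Separate the input columns from the column to predict."""
    return df.drop(columns=[target]), df[target]


def split_train_test(features, labels, test_fraction=0.2, seed=0):
    """Shuffle row positions and hold out a fraction of them for testing."""
    # Truncate to the common length so features and labels stay aligned.
    n = min(len(features), len(labels))
    features, labels = features.iloc[:n], labels.iloc[:n]
    order = np.random.default_rng(seed).permutation(n)
    n_test = int(n * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (
        features.iloc[train_idx],
        labels.iloc[train_idx],
        features.iloc[test_idx],
        labels.iloc[test_idx],
    )
```

Shuffling before the split keeps the held-out rows from being biased by the order of the CSV file.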
Create a model
Finally, we build a Keras Sequential model.
We compile the model with the Adam optimizer, mean squared error loss, and two metrics.
In addition, we use two callbacks:
Early stopping, which is self-explanatory
Reducing the learning rate on plateau
After training for 50 epochs with a batch size of 64, our model performs quite well.
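The model setup described above might look roughly like this. It is a sketch under several assumptions: the layer sizes, the learning rate, the two metrics (MAE and RMSE), and the callback settings are not given in the article, and a plain numeric input replaces the DenseFeatures layer so that the example stays self-contained.

```python
import tensorflow as tf


def build_model(n_features):
    """Fully connected regression network; layer sizes are assumed."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),  # single output: the predicted price
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
        # Assumed choice for the "two metrics" mentioned in the article.
        metrics=["mae", tf.keras.metrics.RootMeanSquaredError()],
    )
    return model


def make_callbacks():
    """Early stopping plus learning-rate reduction when val_loss plateaus."""
    return [
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=5, restore_best_weights=True
        ),
        tf.keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, patience=2
        ),
    ]
```

Training with the settings from the article would then be a call such as model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=50, batch_size=64, callbacks=make_callbacks()), where x_val and y_val are a held-out validation set. Both callbacks monitor val_loss, so validation data must be supplied.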
We used AirBnB data from New York City to build a fully connected neural network that predicts the prices of future listings. Pandas and seaborn made it easy to visualize and examine the data. We introduced a latitude and longitude cross as a feature of the model. And thanks to Kaggle's open dataset, we have a fully working machine learning model.