In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
Editor to share with you how to deal with the exception of Python Pandas reading Excel date data. I hope you will get something after reading this article. Let's discuss it together.
Exception description
Sometimes our Excel has a date field with custom formatting:
When we read with pandas, it looks like this:
No matter how you specify the parameter, it is not valid.
Cause of occurrence
The system does not use the built-in date cell format, the custom format does not define the negative format, pandas does not recognize the date format when reading, but reads out the actual value stored in the cell.
Solution: modify custom format
You can modify it to a custom format built into the system:
Or add the definition of negative numbers to the custom format:
Add; @ is fine
Pandas directly parses the Excel value as a date
Sometimes there are so many Excel that we need to read them in batches through pandas, and it takes time and effort to manually modify the custom format of Excel one by one. I'll show you how to parse these values directly into a date format using pandas.
The conversion rules for general format and date format in excel are as follows:
1900-1-1 is the start date, the conversion number is 1, and increase by 1 for each subsequent day.
The conversion to a number on 1900-1-2 is 2
The conversion to the number on 1900-1-3 is 3
Conversion to number on 1900-1-4 is 4
and so on
The time conversion rules in excel are as follows:
The rule in time is to convert one day into a number one.
Every hour is one stroke 24.
Every minute is 1 / (24 × 60) = 1 stroke 1440
Every second is 1 / (24 × 60 × 60) = 1max 86400
According to Excel's date storage rules, we only need to take 1900-1-1 as the base date and offset NMUI for 1 day according to the value n to get the actual date. However, there is another problem. Excel stores February 29th in 1900, but there is no such day in normal calendars, and our dates are all greater than 1900, so we should shift NMQ 2 days and simply use December 30th in 1899 as the benchmark, so there is no need for subtraction.
The parsing code is as follows:
Import pandas as pdfrom pandas.tseries.offsets import Daydf = pd.read_excel ("date .xlsx") basetime = pd.to_datetime ("1899-12-30") df. Date = df. Date. Apply (lambda x: basetime+Day (x)) df. Date = df. Date .apply (lambda x: F "{x.month} month {x.day} day") df.head () date 0June 8, 16 June 9, 26 June 10, 3 June 11, 4 June 12
If you need to call the strftime method of time, you need to set locale because it contains Chinese:
Import pandas as pdfrom pandas.tseries.offsets import Dayimport localelocale.setlocale (locale.LC_CTYPE, 'chinese') df = pd.read_excel ("date .xlsx") basetime = pd.to_datetime ("1899-12-30") df. Date = df. Date. Apply (lambda x: basetime+Day (x)) df. Date = df. Date .dt.strftime ('% Y% m / d') df.head () date 02021 / 06.8 12021 / 09.22021 / 010,32021 / 011,42021 / 12 / 06, I believe you have a certain understanding of "abnormal handling of Python Pandas reading Excel date data". If you want to know more about it, please follow the Industry Information Channel and thank you for your reading!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.