A collection of posts related to my upcoming book “Data Science and Analytics with Python”. Take a look and enjoy.
Always wanted to get some data from the web in a programmatic way? Well, check out my recent post in the Domino Data Blog where I discuss how to get data with the help of Beautiful Soup.
The aim is to show how we can create a script that grabs the pages we are interested in and obtains the information we are after. In the post I cover how to complete these steps:
- Identify the webpage with the information we need
- Download the source code
- Identify the elements of the page that hold the information we need
- Extract and clean the information
- Format and save the data for further analysis
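The steps above can be sketched in a few lines of Python. This is only a minimal illustration and not the code from the Domino post: the sample HTML, the table layout and the function names `download_source`, `extract_rows` and `to_csv` are all assumptions made for the example, and the download step uses the standard library's `urllib`.

```python
import csv
import io
from urllib.request import urlopen

from bs4 import BeautifulSoup


def download_source(url):
    """Download the source code of a page as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_rows(html):
    """Find the table rows in the page and clean the text in each cell."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # get_text(strip=True) removes the surrounding whitespace
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def to_csv(rows):
    """Format the cleaned rows as CSV text, ready to be written to a file."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up page source, standing in for the result of download_source(url)
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Quantity</th></tr>
  <tr><td> spam </td><td> 42 </td></tr>
</table>
"""

rows = extract_rows(SAMPLE_HTML)
print(to_csv(rows))
```

For a real page, the tags passed to `find_all` would need to match whatever elements hold the information we are after, which is why inspecting the page source comes first.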
Are you interested in exploring data using Python? If so, take a look at this blog post of mine, where I talk about using Pandas Profiler and D-Tale to carry out data exploration.
Data exploration is a helpful step to:
- Detect erroneous data.
- Determine how much missing data there is.
- Understand the structure of the data.
- Identify important variables in the data.
- Sense-check the validity of the data.
I use the Mammographic Mass Data Set from the UCI Machine Learning Repository. Information about this dataset can be obtained here.
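As a rough idea of what these checks look like, here is a sketch using plain pandas rather than Pandas Profiler or D-Tale, which automate much of this. The toy data, the column names and the plausible age range are assumptions for illustration only and are not taken from the Mammographic Mass Data Set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are illustrative only
df = pd.DataFrame(
    {
        "age": [67, 43, np.nan, 28, 430],  # 430 is a deliberate entry error
        "density": [3, np.nan, 3, 2, 3],
        "severity": [1, 1, 0, 0, 1],
    }
)

# Understand the structure of the data
print(df.dtypes)

# Determine how much missing data there is, column by column
missing = df.isna().sum()
print(missing)

# Detect erroneous data: ages outside an assumed plausible range
suspect = df[(df["age"] < 0) | (df["age"] > 120)]
print(suspect)

# Sense-check the validity of the data with summary statistics
print(df.describe())

# A first look at which variables relate to the target column
print(df.corr()["severity"])
```

Note that rows with a missing age are not flagged as suspect, since comparisons with `NaN` evaluate to `False`; they are picked up by the missing-data count instead.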
Read the full blog post in the Domino Data Blog here.
Hello again, this is a video I recorded for my publisher about my book “Advanced Data Science and Analytics with Python”. You can get the book here and find out more about the book here.
This companion to "Data Science and Analytics with Python" is the result of arguments with myself about writing something to cover a few of the areas that were not included in that first volume, largely due to space/time constraints. Like the previous book, this one exists thanks to the discussions, stand-ups, brainstorms and eventual implementations of algorithms and data science projects carried out with many colleagues and friends.
As the title suggests, this book continues to use Python as a tool to train, test and implement machine learning models and algorithms. The book is aimed at data scientists who would like to continue developing their skills and apply them in business and academic settings.
The intended audience for this book is still composed of data analysts and early-career data scientists with some experience in programming and with a background in statistical modelling. In this case, however, the expectation is that they have already covered some areas of machine learning and data analytics. The subjects discussed in this book are complementary and a follow-up to the topics discussed in "Data Science and Analytics with Python". Although there are some references to the previous book, this volume is written to be read independently.
I have tried to keep the same tone as in the first book, peppering the pages with some bits and bobs of popular culture, science fiction and indeed Monty Python puns. The aim is still to focus on showing the concepts and ideas behind popular algorithms and their use.
In summary, each of the topics addressed in "Advanced Data Science and Analytics with Python" tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The material covered includes machine learning and pattern recognition algorithms, including time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning. The book discusses the need to develop data products and addresses the subject of bringing models to their intended audiences, in this case literally to the users’ fingertips in the form of an iPhone app.
I hope you enjoy it and if you want to know more about my other books, please check the related videos here:
The book provides an introduction to some of the most used algorithms in data science and analytics. This book is the result of very interesting discussions, debates and dialogues with a large number of people at various levels of seniority, working at startups as well as long-established businesses, and in a variety of industries, from science to media to finance.
“Data Science and Analytics with Python” is intended to be a companion to data analysts and budding data scientists who have some working experience with both programming and statistical modelling, but who have not necessarily delved into the wonders of data analytics and machine learning. The book uses Python as a tool to implement and exploit some of the most common algorithms used in data science and data analytics today.
Python is a popular and versatile scripting and object-oriented language; it is easy to use and has a large, active community of developers and enthusiasts, not to mention the richness of its libraries, all of this helped by the versatility of the IPython/Jupyter Notebook.
In the book I address the balance between the knowledge required by a data scientist, such as mathematics and computer science, and the need for a good business background. To tackle the prevailing image of a unicorn data scientist, I am convinced that the use of a new symbol is needed. And a silly one at that! There is an allegory I usually propose to colleagues and those that talk about the data science Unicorn. It seems to me to be a more appropriate one than the existing image: It is still another mythical creature, less common perhaps than the unicorn, but more importantly with some faint fact about its actual existence: a Jackalope. You will have to read the book to find out more!
The main purpose of the book is to present the reader with some of the main concepts used in data science and analytics using tools developed in Python such as Scikit-learn, Pandas, Numpy and others. The book is intended to be a bridge to the data science and analytics world for programmers and developers, as well as graduates in scientific areas such as mathematics, physics, computational biology and engineering, to name a few.
The material covered includes machine learning and pattern recognition, various regression techniques, classification algorithms, decision tree and hierarchical clustering, and dimensionality reduction. Though this text is not recommended for those just getting started with computer programming,
There are a number of topics that were not covered in this book. If you are interested in more advanced topics, take a look at my book called “Advanced Data Science and Analytics with Python”. There is a follow-up video for that one! Keep an eye out for that!
Related Content: Please take a look at other videos about my books:
This survey paper extracts practical considerations from recent case studies of a variety of ML applications and is organized into sections that correspond to stages of a typical machine learning workflow: from data management and model learning to verification and deployment.
In recent years, machine learning has received increased interest both as an academic research field and as a solution for real-world business problems. However, the deployment of machine learning models in production systems can present a number of issues and concerns. This survey reviews published reports of deploying machine learning solutions in a variety of use cases, industries and applications and extracts practical considerations corresponding to stages of the machine learning deployment workflow. Our survey shows that practitioners face challenges at each stage of the deployment. The goal of this paper is to lay out a research agenda to explore approaches addressing these challenges.
Right!!! It is early December and this post has been in the inkwell for a few months now. Earlier in the year I received the comments and suggestions from reviewers and the final approval from the excellent team at CRC Press for my 4th book.
After a few weeks of frank procrastination and a few more spent structuring my thoughts, I now have a clear head to start writing. So I am pleased to announce that I am officially starting to write “Statistics and Data Visualisation with #Python”.
"Statistics and Data Visualisation with Python" builds from the ground up the basis for statistical analysis underpinning a number of applications and algorithms in business analytics, machine learning and applied machine learning. The book will cover the basics of programming in Python as well as data analysis to build a solid background in statistical methods and hypothesis testing useful in a variety of modern applications.
I was not expecting this today, but I am very pleased to see that my first physical copies of "Advanced Data Science and Analytics" have arrived. I was working under the assumption that these would not be sent until after lockdowns were lifted, but that was not the case.
I am very happy to see the actual book and hold it in my hands!
I also hear that individual copies have started arriving with their new owners. If you ordered yours, let me know when it arrives. I will post your pictures!
It is official! "Advanced Data Science and Analytics with Python" is published!
According to the information I had received from CRC Press, my publisher, the book would be available on May 7th. According to the official page of the book, the volume has been available since May 5th.
Looking forward to hearing what you think of the book.
I am reaching out as volume 2 of my data science book will be out for publication in May and my publisher has made it possible for me to offer 20% off. You can order the book here.
This follows from "Data Science and Analytics with Python" and both books are intended for practitioners in data science and data analytics in both academic and business environments.
The new book aims to present the reader with concepts in data science and analytics that were deemed to be more advanced or simply out of scope in the author's first book, and are used in data analytics using tools developed in Python such as SciKit Learn, Pandas, Numpy, etc. The use of Python is of particular benefit given its recent popularity in the data science community. The book is therefore a reference to be used by seasoned programmers and newcomers alike and the key benefit is the practical approach presented throughout the book.
More information about the first book can be found here.
Well, these are the final corrections for my latest book "Advanced Data Science and Analytics with Python". Next stop publication!
Super excited to have received the proofread version of Advanced Data Science and Analytics with Python. They all seem to be very straightforward corrections: a few missing commas, some italics here and there and capitalisation bits and bobs.
I hope to be able to finish the corrections before my deadline of March 25th, and then enter the last phase before publication in May 2020.
I have received the latest information about the status of my book “Advanced Data Science and Analytics with Python”. This time I am reviewing the latest cover drafts for the book.
This is currently my favourite one.
Awaiting the proofreading comments, and I hope to update you about that soon.
If you are interested in #DataScience you surely have heard of #pandas, and you will be pleased to hear that version 1.0 is finally out, with better integration with NumPy and improvements with Numba, among others. Take a look!
— Read on www.anaconda.com/pandas-1-0-is-here/
It was great to be invited to give the joint Physics Astronomy and Maths + Computer Science research seminar today at the University of Hertfordshire. I had a good opportunity to meet old colleagues and meet new faculty. There were also many students and they had many questions.
I was glad to hear they are thinking about offering more data science courses and even a dedicated programme. I would definitely be interested to hear more about that.
There you go, the first checkpoint is completed: I have officially submitted the completed version of "Advanced Data Science and Analytics with Python".
The book has been some time in the making (and in the thinking...). It is a follow-up to my previous book, imaginatively called "Data Science and Analytics with Python". The book covers aspects that were necessarily left out in the previous volume; however, the readers in mind are still technical people interested in moving into the data science and analytics world. I have tried to keep the same tone as in the first book, peppering the pages with some bits and bobs of popular culture, science fiction and indeed Monty Python puns.
Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary and a follow-up to the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as SciKit-learn, Pandas, Numpy, Beautiful Soup, NLTK, NetworkX and others. The development is also supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications.
The book can be read independently from the previous volume and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book.
Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered in the book. The book discusses the need to develop data products and tackles the subject of bringing models to their intended audiences, in this case literally to the users' fingertips in the form of an iPhone app.
While the book is still in the oven, you may want to take a look at the first volume. You can get your copy here:
Furthermore you can see my Author profile here.